(57) Abstract: The present invention features methods for analyzing a sequence of a target polynucleotide by detecting incorporation
of a nucleotide into its complementary strand, where the polynucleotides may be bound at high density and at single molecule
resolution. The invention also features labeling moieties and blocking moieties, which facilitate chain termination or choking. Certain
aspects provide for temporal detection of the incorporations; some allow for asynchronous analysis of a plurality of target polynucleotides
and the use of short sequencing cycles. Surface chemistry aspects of the sequencing methods are also provided. The method
may also be used in kits, said kits designed to carry out and facilitate the methods provided herein.

Reference to Related Applications

[0001] This non-provisional patent application claims the benefit of and priority to U.S. provisional 
application no. 60/546,277, filed February 19, 2004, and U.S. provisional application no. 60/547,611,
filed February 24, 2004. Reference also is made to U.S. non-provisional application no. 09/605,520, 
filed June 27, 2000; U.S. provisional patent application no. 60/141,503, filed June 28, 1999; U.S. 
provisional patent application no. 60/147,199, filed August 3, 1999; U.S. provisional patent 
application no. 60/163,742, filed November 4, 1999; U.S. provisional patent application no. 
60/186,856, filed March 3, 2000, and U.S. provisional patent application no. 60/275,232, filed March
12, 2001; U.S. non-provisional application no. 09/707,737, filed November 6, 2000; U.S. non- 
provisional application no. 09/908,830, filed July 18, 2001; and U.S. non-provisional application no. 
10/099,459, filed March 12, 2002; as well as U.S. provisional application no. 60/519,862, filed 
November 12, 2003. The text of each of the foregoing patent applications is hereby incorporated by 
reference.

Technical Field of the Invention 

[0002] The invention generally relates to methods for analyzing the sequence of a target
polynucleotide. More particularly, the invention involves detecting incorporation of a nucleotide into the
complementary strand of the target polynucleotide.

Background of the Invention

[0003] Genetic sequencing finds many important applications in biotechnology, genetics, and 
pharmacology, as well as medical diagnoses and therapeutic treatments. For example, sequencing 
individual genomes and individual cells can be used to determine genetic variability, disease 
susceptibility and pharmaceutical efficacy. While earlier methods have proved useful in these
applications, there remains a need in the art for even better methods of analyzing genetic information.

Summary of the Invention
I. INTRODUCTION 

[0004] The present invention provides methods and kits for analyzing the sequence of a target
polynucleotide by detecting incorporation of a nucleotide into its complementary strand. Certain
embodiments provide for detection of incorporation of a single nucleotide into the complementary
strand of a single target polynucleotide. Some
embodiments use labeling moieties that facilitate chain termination or choking. Some embodiments
use separate labeling and blocking moieties, but still allow single step reversal of chain termination
and reduction of incorporated signals. Some embodiments use bleachable labeling moieties, whose
signal can be reduced without cleavage of the structural moiety. Some embodiments use quenched
labeling moieties, which become detectable upon incorporation and/or upon further reaction. Certain
aspects provide for allowing successive incorporations of a number of nucleotides on a support; other
aspects allow for temporal detection of the incorporations.



[0005] Certain embodiments of the present invention are directed to analysis of a plurality of target 
polynucleotides in parallel. For example, methods of parallel analysis of a plurality of polynucleotide 
molecules randomly bound to a substrate are provided. In certain embodiments, the polynucleotide 
molecules are bound at high density and at single molecule resolution. Moreover, certain 
embodiments allow for asynchronous analysis of the plurality of target polynucleotides and the use of 
short sequencing cycles. 

[0006] The present invention also provides numerous applications of the sequencing and analysis 
methods. Some embodiments provide for identifying the address of a polynucleotide molecule 
randomly bound to a substrate, while some embodiments provide for counting copies of identified 
molecules. 

[0007] Certain aspects of the invention relate to analyzing DNA sequences and applications 
corresponding thereto. For example, some embodiments provide for identifying a mutation useful, for
example, in diagnosis and/or prognosis of conditions such as cancer. Certain embodiments provide
methods of doing genetic cancer research, for example, by identifying changes in cell diploidy.

[0008] Other aspects of the invention relate to analyzing RNA sequences and applications
corresponding thereto. Such embodiments include methods for enumerating copy number of RNA
transcripts, methods for identifying alternate splice sites, and methods for analyzing the RNA
sequences of a cell in parallel. These methods find use in a number of applications also provided 
herein, including identifying unknown RNA molecules, annotating genomes based on transcribed 
sequences, and determining phylogenic relationships of various species. Other embodiments provide 
for determining cellular responses to different stimuli, while still other embodiments provide for 
compiling transcriptional patterns of cells in different stages of cellular differentiation, thereby 
facilitating methods of tissue engineering. 

[0009] Yet other aspects of the present invention relate to surface chemistry. Some such 
embodiments provide substrates and methods for hindering an anchored polynucleotide from lying 
down, as well as for reducing background fluorescence when detecting fluorescently-labeled
nucleotides incorporated into the complementary strand. Moreover, some of these embodiments
permit high density anchoring of polynucleotide molecules at single molecule resolution.

II. Aspects of the Present Invention

A. Fluorescent Single Base Extension on a Substrate 

[0010] In one aspect, the present invention provides methods for analyzing the sequence of a target 
polynucleotide. The methods include the steps of: 

[0011] (a) providing a primed target polynucleotide immobilized to a surface of a substrate;
wherein the target polynucleotide is attached to the surface with single molecule resolution;

[0012] (b) in the presence of a polymerase, adding a first fluorescently labeled nucleotide to the
surface of the substrate under conditions whereby the first nucleotide attaches to the primer, if a 
complementary nucleotide is present to serve as template in the target polynucleotide; 
[0013] (c) determining presence or absence of a fluorescence signal on the surface where the 
target polynucleotide is immobilized, the presence of a signal indicating that the first nucleotide was 
incorporated into the primer, and hence the identity of the complementary base that served as a 
template in the target polynucleotide; and 

[0014] (d) repeating steps (b) - (c) with a further fluorescently labeled nucleotide, the same or 
different from the first nucleotide, whereby the further nucleotide attaches to the primer or a 
nucleotide previously incorporated into the primer. 
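
The cycle of steps (b) through (d) can be sketched in code. The following is a minimal illustrative simulation rather than the claimed method itself: it assumes exactly one incorporation per matching step (as if further extension were blocked), a noise-free signal, and a template read in the order in which it is copied. The function name `sequence_by_single_base_extension` and its parameters are hypothetical.

```python
# Watson-Crick pairing used to map an incorporated nucleotide back to the template base.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def sequence_by_single_base_extension(template, nucleotide_order="ACGT", max_steps=1000):
    """Simulate steps (b)-(d): offer one labeled nucleotide type at a time and
    record a signal only when it pairs with the next unread template base.

    Assumption (not from the source): at most one incorporation per step.
    """
    position = 0        # index of the next template base to be copied
    incorporated = []   # nucleotides added to the primer, in order
    step = 0
    while position < len(template) and step < max_steps:
        offered = nucleotide_order[step % len(nucleotide_order)]  # step (b)
        signal = COMPLEMENT[template[position]] == offered        # step (c)
        if signal:
            incorporated.append(offered)
            position += 1
        step += 1                                                 # step (d): repeat
    # Each incorporated nucleotide identifies its complementary template base.
    inferred_template = "".join(COMPLEMENT[n] for n in incorporated)
    return inferred_template, step


inferred, steps = sequence_by_single_base_extension("GATTACA")
```

In this toy model the inferred template always equals the input, and the number of steps depends only on the order in which nucleotide types are offered.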

B. Choking (including Sanger-like Sequencing using Choking Moieties) 

[0015] Some embodiments of the invention provide methods for analyzing a sequence of a target 
polynucleotide by synthesizing a complementary strand, comprising: providing a labeled nucleotide, 
said labeled nucleotide comprising a labeling moiety hindering further chain elongation by steric 
hindrance; allowing incorporation of said nucleotide into said complementary strand in the presence 
of a polymerizing agent; and detecting incorporation, thereby analyzing said sequence of said target 
polynucleotide. Methods also may be used in kits, said kits designed to carry out and facilitate the 
methods provided herein. 

C. Single Step Bleaching & Cleaving 

[0016] Some embodiments of the invention provide methods for analyzing a sequence of a target 
polynucleotide by synthesizing a complementary strand, comprising: providing a labeled nucleotide, 
said labeled nucleotide comprising a labeling moiety and a blocking moiety, wherein said moieties are 
capable of being bleached and cleaved, respectively, in a single step of bleaching and cleaving; 
allowing incorporation of said nucleotide into said complementary strand in the presence of a 
polymerizing agent; and detecting incorporation, thereby analyzing said sequence of said target 
polynucleotide. Methods also may be used in kits, said kits designed to carry out and facilitate the 
methods provided herein. 

D. Noncleavable labeling moiety Approach 

[0017] Some embodiments of the invention provide methods for analyzing a sequence of a target 
polynucleotide by synthesizing a complementary strand, comprising: anchoring said target 
polynucleotide to a surface of a substrate; providing two or more types of labeled nucleotide, said 
labeled nucleotide comprising a non-cleavable labeling moiety and a blocking moiety; allowing 
incorporation of said nucleotide into said complementary strand in the presence of a polymerizing 
agent; and detecting incorporation; thereby analyzing said sequence of said target polynucleotide. 
Methods also may be used in kits, said kits designed to carry out and facilitate the methods provided 
herein. 

E. Non-α-Phosphate-Quenching

[0018] Some embodiments of the invention provide methods for analyzing a sequence of a target
polynucleotide by synthesizing a complementary strand, comprising: providing a labeled nucleotide,
said labeled nucleotide comprising a quenching moiety on at least one of a non-α-phosphate of said
nucleotide and a fluorescent moiety; allowing incorporation of said nucleotides into said
complementary strand in the presence of a polymerizing agent; and detecting incorporation, thereby
analyzing said sequence of said target polynucleotide. Methods also may be used in kits, said kits 
designed to carry out and facilitate the methods provided herein. 

F. Asynchronous and Short-Cycle Sequencing 

[0019] Some embodiments of the invention provide methods for analyzing sequences of two or
more target polynucleotides by asynchronously synthesizing two or more complementary strands in
parallel, comprising: localizing said target polynucleotides on a surface of a substrate at individually-
addressable locations; providing a labeled nucleotide, said nucleotide comprising a labeling moiety;
allowing incorporation of said nucleotide into said complementary strands in the presence of a
polymerizing agent wherein different numbers of said nucleotide may be incorporated into at least
two of said complementary strands in a given period of time; detecting incorporation at said
individually-addressable locations for said given period of time; thereby analyzing said sequences of
said target polynucleotides. Methods may also be used in kits, said kits designed to carry out and 
facilitate the methods provided herein. 

[0020] Some embodiments of the invention also provide methods for analyzing a sequence of a
target polynucleotide by synthesizing a complementary strand, comprising:
[0021] localizing said target polynucleotide on a surface of a substrate; providing a labeled
nucleotide, said nucleotide comprising a labeling moiety; allowing a cycle of incorporation reactions
of said nucleotide into said complementary strand in the presence of a polymerizing agent; halting
said cycle after a period of time, said period permitting at least a chance of incorporation of two or
less of said nucleotides into said complementary strand; and detecting incorporation, thereby
analyzing said sequence of said target polynucleotide. Methods also may be used in kits, said kits
designed to carry out and facilitate the methods provided herein. 

G. Movie Mode 

[0022] Some embodiments of the invention provide methods for analyzing a sequence of a target 
polynucleotide by synthesizing a complementary strand, comprising: providing four types of
nucleotides wherein at least one of said types of nucleotides is a labeled nucleotide comprising a 
labeling moiety; allowing incorporation of said labeled nucleotide into said complementary strand in 
the presence of a polymerizing agent; and temporally detecting incorporation, thereby analyzing said 
sequence of said target polynucleotide. Methods may also be used in kits, said kits designed to carry 
out and facilitate the methods provided herein.

H. Single Base Extension of Randomly Bound Molecule

[0023] Some embodiments of the invention provide methods for analyzing a sequence of a
randomly-localized target polynucleotide by synthesizing a complementary strand, comprising:
permitting random localization of said target polynucleotide on a surface of a substrate; providing a
labeled nucleotide, said nucleotide comprising a labeling moiety; allowing incorporation of said
nucleotide into said complementary strand in the presence of a polymerizing agent; and detecting
incorporation, thereby analyzing said sequence of said target polynucleotide. Methods also may be
used in kits, said kits designed to carry out and facilitate the methods provided herein.

I. High Density Single Base Extension

[0024] Some embodiments of the invention provide methods for analyzing a sequence of a target
polynucleotide at high density by synthesizing a complementary strand, comprising: permitting
localization of said target polynucleotide on a surface of a substrate at a density of at least 1,000 target
polynucleotides per cm²; providing a labeled nucleotide, said nucleotide comprising a labeling
moiety; allowing incorporation of said nucleotide into said complementary strand in the presence of a
polymerizing agent; and detecting incorporation, thereby analyzing said sequence of said target
polynucleotide. Methods also may be used in kits, said kits designed to carry out and facilitate the
methods provided herein. 

J. Address Identification of Randomly Bound Molecule 

[0025] Some embodiments of the invention provide methods for identifying an address of a
randomly-localized target polynucleotide, comprising: permitting random localization of said target
polynucleotide on a surface of a substrate; providing a labeled nucleotide, said nucleotide comprising
a labeling moiety; allowing hybridization of said labeled nucleotide to a complementary base of said
target polynucleotide before or after said step of permitting random localization; and detecting said
labeled nucleotide, thereby identifying said location of said randomly-localized target polynucleotide.
Methods also may be used in kits, said kits designed to carry out and facilitate the methods provided
herein.

K. Achieving Sequencing of a Given Number of Bases on a Support

[0026] Some embodiments of the invention provide methods of analyzing a number of bases of a
sequence of a target polynucleotide by synthesizing a complementary strand, comprising: permitting
localization of said target polynucleotide on a surface of a substrate; providing up to four types of
nucleotides, at least one of said types comprising a labeling moiety and allowing incorporations of
said number of said nucleotides into said complementary strand in the presence of a polymerizing
agent wherein said number is at least six; and detecting said incorporations after incorporation of one
or more of said number of said nucleotides, thereby analyzing said number of bases of said sequence
of said target polynucleotide. Methods also may be used in kits, said kits designed to carry out and
facilitate the methods provided herein.

L. Polynucleotide Counting and Identification, and Applications Thereof

[0027] Some embodiments of the invention provide methods of enumerating a number of copies of
a target polynucleotide by synthesizing a complementary strand, comprising: permitting random
localization of said target polynucleotide on a surface of a substrate at an individually-addressable
location; providing a labeled nucleotide, said nucleotide comprising a labeling moiety; allowing
incorporation of said nucleotide into said complementary strand in the presence of a polymerizing 
agent; detecting incorporation; repeating said providing, said allowing, and said detecting steps a 
number of times sufficient to identify a copy of said target polynucleotide; and counting said 
identified copies, thereby enumerating said number of copies of said target polynucleotide. Methods 
also may be used in kits, said kits designed to carry out and facilitate the methods provided herein. 
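
As a rough illustration of the counting step, assume each individually-addressable location has already yielded an identified read. The addresses, sequences, and the `min_length` threshold below are made-up placeholders rather than values from this specification; copies can then be tallied as follows.

```python
from collections import Counter


def count_identified_copies(reads_by_address, min_length=1):
    """Tally how many addressable locations carry each identified sequence.

    reads_by_address: dict mapping a hypothetical (x, y) address to the bases read there.
    min_length: reads shorter than this are treated as not yet identified and skipped.
    """
    return Counter(
        read for read in reads_by_address.values() if len(read) >= min_length
    )


# Hypothetical example data: four locations, two distinct identified targets.
reads = {(10, 4): "ACGTAC", (52, 8): "ACGTAC", (7, 91): "TTGACA", (33, 60): "ACG"}
copies = count_identified_copies(reads, min_length=6)
```

Here the short read at address (33, 60) is excluded because it has not been extended far enough to count as identified under the assumed threshold.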

M. Surface Chemistry 

[0028] Some embodiments of the invention provide methods of analyzing a sequence of a target 
polynucleotide by synthesizing a complementary strand, comprising: coating a surface of a substrate 
with a polyelectrolyte multilayer; permitting localization of said target polynucleotide on said surface 
of said substrate; providing a labeled nucleotide, said nucleotide comprising a labeling moiety;
allowing incorporation of said nucleotide into said complementary strand in the presence of a 
polymerizing agent; and detecting incorporation, thereby analyzing said sequence of said target 
polynucleotide. Methods also may be used in kits, said kits designed to carry out and facilitate the 
methods provided herein. 

[0029] Another aspect of the present invention provides a substrate comprising: a layer of 
polyanions; and a polynucleotide molecule anchored onto said layer of polyanions wherein said
polynucleotide molecule is hindered from lying down on said layer.

N. Flow Cell

[0030] In another aspect, the invention provides apparatuses for carrying out the methods of the 
invention. Typically, apparatuses include: 

[0031] (a) a flow cell which houses a substrate for immobilizing target polynucleotide(s) with
single molecule resolution; 

[0032] (b) an inlet port and an outlet port in fluid communication with the flow cell for flowing 
fluids into and through the flow cell; 

[0033] (c) a light source for illuminating the surface of the substrate; and 

[0034] (d) a detection system for detecting a signal from said surface. 

[0035] In another aspect of the present invention, apparatuses for analyzing the sequence of a
polynucleotide are provided. Some of the apparatuses are microfabricated. In some of these
embodiments, the substrate is a microfabricated synthesis channel. Thus, the apparatuses may
include:

[0036] (a) a flow cell with at least one micro-fabricated synthesis channel; and

[0037] (b) an inlet port and an outlet port which are in fluid communication with the flow cell and
which flow fluids and reagents, such as deoxynucleoside triphosphates and polymerase, into and
through the flow cell.

[0038] In some embodiments of the invention, a light source for illuminating the surface of said 
substrate and a detection system for detecting a signal from said surface are employed. Thus, some of 
the apparatuses additionally include: 

[0039] (c) a light source to direct light at a surface of the synthesis channel; and 
[0040] (d) a detector to detect a signal from the surface. 

[0041] Optionally, an appropriately programmed computer is also employed for recording identity 
of a nucleotide when the nucleotide becomes incorporated into the immobilized primer or template. 
[0042] In some embodiments, the synthesis channel is formed by bonding a microfluidic chip to a 
flat substrate. In some apparatuses, the microfluidic chip also contains micro-fabricated valves and 
micro-fabricated pumps in an integrated system with the synthesis channel. In some of these 
embodiments, a plurality of reservoirs for storing reaction reagents are also present, and the micro- 
fabricated valve and pump are connected to the reservoirs. In some embodiments, the detector is a 
photon counting camera. In some of the apparatuses, the microfluidic chip is fabricated with an 
elastomeric material such as RTV silicone. The substrate of some of the apparatuses is a glass cover 
slip. The cross section of the synthesis channel in some of the apparatuses has linear dimensions of
less than about 100 µm x 100 µm, less than about 10 µm x 100 µm, less than about 1 µm x 10 µm, or
less than about 0.1 µm x 1 µm.

[0043] In a further aspect, the present invention provides methods for analyzing the sequence of a 
target polynucleotide using such apparatuses, including the steps of: 

[0044] (a) providing a primed target polynucleotide linked to a microfabricated synthesis channel;
[0045] (b) flowing a first nucleotide through the synthesis channel under conditions whereby the 
first nucleotide attaches to the primer, if a complementary nucleotide is present to serve as template in 
the target polynucleotide; 

[0046] (c) determining presence or absence of a signal, the presence of a signal indicating that the
first nucleotide was incorporated into the primer, and hence the identity of the complementary base 
that served as a template in the target polynucleotide; 
[0047] (d) removing or reducing the signal, if present; and

[0048] (e) repeating steps (b) - (d) with a further nucleotide that is the same or different from the
first nucleotide, whereby the further nucleotide attaches to the primer or a nucleotide previously
incorporated into the primer.

[0049] In some embodiments, step (a) comprises providing a plurality of different primed target 
polynucleotides linked to different synthesis channels; step (b) comprises flowing the first nucleotide 
through each of the synthesis channels; and step (c) comprises determining presence or absence of a
signal in each of the channels, the presence of a signal in a synthesis channel indicating the first
nucleotide was incorporated into the primer in the synthesis channel, and hence the identity of the
complementary base that served as a template in the target polynucleotide in the synthesis channel. In
some embodiments, a plurality of different primed target polynucleotides are linked to each of the
synthesis channels.

[0050] Some embodiments include the further steps of flushing the synthesis channel to remove 
unincorporated nucleotides. In some methods, steps (b) - (d) are performed at least four times with 
four different types of nucleotides. In some methods, steps (b) - (d) are performed until the identity of 
each base in the target polynucleotide has been identified. In some of these embodiments, the 
removing or reducing step is performed by photobleaching. In some methods, all ingredients are
present simultaneously, facilitating a continuous monitoring of the incorporation.

O. Single Molecule, Single Base Extension

[0051] Some embodiments of the invention provide methods for forming a spatially addressable 
array, which comprises determining the sequences of a plurality of polynucleotide molecules in an 
array that has a surface density such that a molecule in said array is in an optically resolvable area.
Methods also may be used in kits, said kits designed to carry out and facilitate the methods provided 
herein. 

[0052] A further understanding of the nature and advantages of the present invention may be 
realized by reference to the remaining portions of the specification, the figures and claims. 
[0053] All publications, patents, and patent applications cited herein are hereby expressly
incorporated by reference in their entirety and for all purposes to the same extent as if each was so
individually denoted.

Brief Description of the Drawings 

[0054] Figure 1 shows schematically immobilization of a primed polynucleotide and incorporation 
of labeled nucleotides. Figure 1a is a schematic illustration and top field view of single molecule
sequencing of a target polynucleotide; Figure 1b shows a more detailed cartoon of the primed target
polynucleotide. 

[0055] Figure 2 shows schematically the optical setup of a detection system for total internal 
reflection microscopy. 

[0056] Figure 3 shows results which indicate streptavidin can be used to immobilize a
polynucleotide template in an exemplified embodiment.

[0057] Figure 4 shows results which indicate that DNA polymerase incorporating labeled 
nucleotide into the immobilized primer is visualized with single molecule resolution. 

[0058] Figure 5 shows incorporation of multiple labeled nucleotides in a bulk experiment in
solution, using biotin-labeled 7G oligonucleotide template (SEQ ID NO:1) and p7G primer (SEQ ID
NO:2).

[0059] Figure 6 shows low background signal from free nucleotides in solution and detection of
signals from incorporated nucleotides.

[0060] Figure 7 shows results from experiments and simulation of multiple bleaching.

[0061] Figure 8 shows dynamics of incorporation of labeled nucleotides into the immobilized
primer.

[0062] Figure 9 shows multiple incorporation events of labeled nucleotides over a period of time. 
[0063] Figure 10 shows statistics of incorporation of labeled nucleotides over a period of time. 
[0064] Figure 11 shows correlation between location of labeled primer and location of
incorporation of labeled nucleotides. 

[0065] Figure 12 shows correlation graphs for incorporation of two labeled nucleotides, using a 
6TA6GC oligonucleotide template (SEQ ID NO:6) and a p7G primer (SEQ ID NO:2). Partial 
sequences of the template, 5'-GccccccAtttttt-3' (SEQ ID NO:7), and the extended product, 5'- 
aaaaaaUggggggC (SEQ ID NO:8), are also shown in the Figure. 

[0066] Figure 13 shows detection of a single DNA molecule using fluorescence resonance energy 
transfer (FRET), when two different labels are incorporated into the same primer extension product.
The polynucleotide template used here is the 7G7A oligonucleotide (SEQ ID NO:5), but only part of
the sequence, 5'-AttctttGcttcttAttctttGcttcttAttctttG-3' (SEQ ID NO:9), is shown in the Figure.
[0067] Figure 14 shows correlation of single molecule FRET signals over a period of time. 
[0068] Figure 15 shows the expected signals from an experiment in which two colors, donor and
acceptor, are incorporated sequentially. Partial sequences of the template, 5'-GccccccAtttttt-3' (SEQ 
ID NO:7), and the extended product, 5'-aaaaaaUggggggC (SEQ ID NO:8), are also shown in the 
Figure. 

[0069] Figure 16 illustrates on-surface incorporation into bound DNA being visualized at the
single-DNA level. 

[0070] Figure 17 is a schematic illustration and top field view of the asynchronous nature of single 
molecule sequencing, where it does not matter if a base incorporates at some but not all copies of a 
given target polynucleotide. 

[0071] Figure 18 illustrates a principle behind asynchronous short-cycle sequencing, that is,
obtaining incorporation in 99% of complementary strands requires a period of several half-lives of the
incorporation reaction, where one half-life is the time taken for at least one incorporation to occur in
50% of the complementary strands. On the other hand, shorter cycles leave a greater percentage of
complementary strands un-extended.
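
The half-life relationship can be made concrete with a short calculation. This sketch assumes simple first-order kinetics, so that the fraction of strands with at least one incorporation after n half-lives is 1 - 0.5^n; that kinetic model and the function names are assumptions made for illustration.

```python
import math


def fraction_extended(half_lives):
    """Fraction of strands with at least one incorporation after `half_lives`."""
    return 1.0 - 0.5 ** half_lives


def half_lives_needed(target_fraction):
    """Half-lives required so that `target_fraction` of strands are extended."""
    return math.log2(1.0 / (1.0 - target_fraction))


print(round(half_lives_needed(0.99), 2))  # about 6.64 half-lives for 99%
print(round(fraction_extended(0.8), 3))   # about 0.426 after a 0.8 half-life cycle
```

Under this assumption, reaching 99% extension takes between six and seven half-lives, whereas a cycle of 0.8 half-lives leaves more than half of the strands un-extended.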

[0072] Figure 19 illustrates the advantage of using short cycle sequencing with respect to avoiding
long homopolymer reads. Figure 19a illustrates the homopolymer issue using non-short cycle
sequencing in analyzing 10 target polynucleotides in a simulated synthesis of their complementary
strands using cycle periods of 10 half lives and repeating the cycles 12 times. Figure 19b illustrates a
short cycle embodiment for analyzing 10 target polynucleotides by simulating the synthesis of their
complementary strands using short-cycle periods of 0.8 half life periods and repeating the cycles 60
times.

[0073] Figure 20 illustrates a short cycle embodiment for analyzing 200 target polynucleotides in a 
simulated synthesis of their complementary strands using short-cycle periods of 0.8 half life periods
and repeating the cycles 60 times. 
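
A short-cycle run of this kind can be approximated with a small Monte Carlo sketch. The model below is an assumption made for illustration and is not the simulation behind the figure: every target is treated as a long homopolymer, incorporations are memoryless with a rate of ln 2 per half-life, and there is no slowdown after repeated labeled incorporations. The function name and parameters are hypothetical.

```python
import math
import random


def simulate_short_cycles(n_targets=200, n_cycles=60, cycle_half_lives=0.8, seed=0):
    """Return the total number of incorporations per target after all cycles.

    Assumed model: exponential waiting times with rate ln(2) per half-life,
    so one half-life gives a 50% chance of at least one incorporation.
    """
    rng = random.Random(seed)
    rate = math.log(2)
    totals = []
    for _ in range(n_targets):
        total = 0
        for _ in range(n_cycles):
            elapsed = rng.expovariate(rate)
            # Count incorporations that complete before the cycle is halted.
            while elapsed < cycle_half_lives:
                total += 1
                elapsed += rng.expovariate(rate)
        totals.append(total)
    return totals


totals = simulate_short_cycles()
mean_incorporations = sum(totals) / len(totals)
fraction_over_25 = sum(t > 25 for t in totals) / len(totals)
print(mean_incorporations, fraction_over_25)
```

Under these assumptions the expected number of incorporations per cycle is 0.8 x ln 2, roughly 0.55, or about 33 over 60 cycles.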

[0074] Figure 21 illustrates the statistics of incorporation, showing that polymerizing agent may
incorporate repeat labeled nucleotides less readily than the first labeled nucleotide.

[0075] Figure 22 illustrates a Monte Carlo simulation showing the effect of slowing down
polymerizing agent and the lengthening of half lives on the cycle period for short cycle sequencing
embodiments.

[0076] Figure 23 illustrates the number of cycles needed with cycle periods of various half lives, 
taking into account slowdown factors of two (squares), five (triangles), and 10 (crosses), in order to 
obtain over 25 incorporations in over 80% of target homopolymers, with at least a 97% chance of
incorporating two or less nucleotides per cycle (or a smaller than 3% chance of incorporating three or 
more nucleotides per cycle). 
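
The 97% criterion can be checked against a simple model. Assuming, for illustration only, that incorporations into a homopolymer follow a Poisson process with a rate of ln 2 per half-life and no slowdown factor, the chance of two or fewer incorporations in one cycle is computed below. The function name is hypothetical.

```python
import math


def prob_at_most_two(cycle_half_lives):
    """P(two or fewer incorporations in one cycle) under an assumed Poisson model."""
    lam = cycle_half_lives * math.log(2)  # expected incorporations per cycle
    return math.exp(-lam) * (1.0 + lam + lam ** 2 / 2.0)


for period in (0.5, 0.8, 1.0, 2.0):
    print(period, round(prob_at_most_two(period), 3))
```

In this simplified model a cycle period of 0.8 half-lives gives a probability of about 0.98, which satisfies the 97% criterion; the slowdown factors discussed for the figure are not modeled here.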

[0077] Figure 24 illustrates one type of choking using Cy5-labeled nucleotides in consecutive 
incorporations. 

Detailed Description of the Invention 
I. OVERVIEW

[0078] The present invention provides methods and kits for analyzing one or more target 
polynucleotides with high sensitivity, parallelism, and long read frames. The analysis involves 
detecting incorporation of one or more nucleotides into the target's complementary strand in the 
presence of polymerizing agent, one or more types of nucleotides, and possibly other reaction 
reagents. 

[0079] In some embodiments, methods for analyzing the sequence of a single target polynucleotide
by single base extension are provided. Such embodiments can detect incorporation of a single
nucleotide molecule into the complementary strand of a single target polynucleotide molecule. Such
single molecule, single base extension embodiments can read a single target molecule individually,
even where multiple copies of the same or different targets are analyzed in parallel.

[0080] In some embodiments, methods are applicable to sequencing by bulk single base extension;
such embodiments detect incorporation of nucleotides into a plural number of copies of a given target
polynucleotide. That is, bulk single base extension embodiments read multiple copies of the same
target, even where there are also multiple copies of different targets being analyzed in parallel.



[0081] In some embodiments of the present invention, the surface of a substrate is pretreated to
create surface chemistry that facilitates polynucleotide attachment and subsequent sequence analysis.
In some of these embodiments, the substrate surface is coated with a polyelectrolyte multilayer
(PEM). Biotin can be applied to the PEM, followed by application of streptavidin. The substrate
surface can then be used to attach biotinylated templates. The PEM-coated substrate may provide
substantial advantages for immobilizing polynucleotides and for polymerization reactions. First, PEM
can easily be terminated with polymers bearing carboxylic acids, facilitating polynucleotide
attachment. Second, the attached template is available for extension by polymerizing agents - most
probably because repulsion of like charges between the negative carboxylic groups, for example, and
the negative polynucleotide backbone hinders the template from "lying down" on the surface. 
Finally, the negative charges repel unincorporated nucleotides, reducing nonspecific binding and 
hence background interference. 

[0082] Certain embodiments involve immobilizing target polynucleotides on the surface of a 
substrate (e.g., a glass or plastic slide, a nylon membrane, or gel matrix). The targets can be
hybridized to a labeled primer (e.g., using a fluorescent dye) to form a target polynucleotide-primer 
complex, and their locations on the surface can be detected with single molecule sensitivity. In some 
aspects of the invention, single molecule resolution was achieved by anchoring the template 
molecules at low concentration to a surface of a substrate coated to create surface chemistry that 
facilitates template attachment and reduces background noise, and then imaging nucleotide 
incorporation, for example, with total internal reflection fluorescence microscopy. 
[0083] In certain embodiments, the signals of already-incorporated nucleotides are removed, 
reduced, and/or neutralized after one or more rounds of incorporation. This may be achieved, for 
example, by photobleaching fluorescent signals, by chemical means, such as chemically bleaching the 
labeling moiety, and/or chemically or photo-chemically cleaving all or a portion of the labeling 
moiety, and/or by enzymatically cleaving all or a portion of the labeling moiety from the nucleotide. 
In some embodiments, extinguishing the labeling is not necessary after every extension cycle, 
reducing the number of cycle steps. 

[0084] In certain embodiments, blocking moieties are used to hinder or halt the polymerization 
reaction. Removal of a portion or all of the blocking moiety reverses the inhibition, allowing chain 
elongation to resume. Such an approach makes it possible to read long runs of identical bases that
may not be quantifiable due to increasing signal intensity. Another approach to homopolymer
stretches involves using short cycle times, wherein only a limited number of nucleotides are allowed to
incorporate in the growing complementary strands during each cycle.
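
As a rough illustration of why a short cycle limits the number of incorporations, the sketch below treats successive incorporations as a Poisson process. That kinetic model, the function name, and the half-life parameterization are assumptions made for illustration; they are not taken from this description.

```python
import math

def prob_at_least(n_incorporations, cycle_time_s, half_life_s):
    """Probability that at least n incorporations occur within one cycle.

    Assumes (hypothetically) that each incorporation has an exponentially
    distributed waiting time with the given half-life, so the count per
    cycle is Poisson with mean k * t, where k = ln(2) / half-life.
    """
    mean = math.log(2) / half_life_s * cycle_time_s
    # P(N < n) is the sum of the first n Poisson terms.
    p_fewer = sum(math.exp(-mean) * mean**i / math.factorial(i)
                  for i in range(n_incorporations))
    return 1.0 - p_fewer

# A cycle as long as one half-life: compare one vs. two or more incorporations.
p1 = prob_at_least(1, cycle_time_s=10.0, half_life_s=10.0)
p2 = prob_at_least(2, cycle_time_s=10.0, half_life_s=10.0)
print(f"P(>=1) = {p1:.3f}, P(>=2) = {p2:.3f}")
```

Under these assumptions, shortening the cycle relative to the half-life lowers the chance of a second incorporation faster than it lowers the chance of the first.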

[0085] Certain embodiments use a labeling moiety that is sufficiently large to prevent or hinder 
further chain elongation by "choking" the polymerizing agent, thereby halting chain elongation 
without a 3' blocking group. Subsequent removal of the labeling moiety, or at least the
steric-hindering portion of the moiety, can concomitantly reverse chain termination and allow chain
elongation to proceed.

[0086] Some embodiments use separate labeling and blocking moieties, but still allow single step
reversal of chain termination and reduction of incorporated signals. In such embodiments, for 
example, chemically cleaving or photo-cleaving the blocking moiety may also chemically-bleach or 
photo-bleach the labeling moiety, respectively. 

[0087] In some other embodiments of the present invention, for example, in bulk single base
extension embodiments, only a small percentage of each type of nucleotides present in the extension
reaction is labeled, e.g., with fluorescent dye. As a result, relatively small numbers of incorporated
nucleotides are fluorescently labeled, interference of energy transfer is minimized, and the
polymerizing agent is less likely to fall off the template or be "choked" by incorporation of two
labeled nucleotides sequentially. This may provide more efficient consumption of polymerizing
agent. In other embodiments, on the other hand, inefficient incorporation is desirable. For example,
stopping or stalling incorporation by choking may be desired. Also, inefficient incorporation may
lead to longer half lives for the slowed down incorporation, which is desirable in some short cycle
sequencing embodiments.
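
The effect of labeling only a small fraction of the nucleotides can be sketched numerically. The snippet below assumes that labeled and unlabeled nucleotides incorporate independently and with equal efficiency, which is a simplification introduced here; the function names are hypothetical.

```python
def prob_adjacent_labeled(labeled_fraction):
    """Chance that two consecutive incorporations are both labeled,
    assuming independent incorporation with equal efficiency
    (an illustrative simplification)."""
    return labeled_fraction ** 2

def expected_labeled(n_incorporated, labeled_fraction):
    """Expected number of labeled residues among n incorporations."""
    return n_incorporated * labeled_fraction

# Compare fully labeled nucleotides with 10% and 1% labeled fractions.
for f in (1.0, 0.1, 0.01):
    print(f, prob_adjacent_labeled(f), expected_labeled(1000, f))
```

Under these assumptions, lowering the labeled fraction reduces the number of labeled residues linearly but reduces the chance of two adjacent labels quadratically.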

[0088] Analysis with single molecule resolution provides the advantage of monitoring the 
individual properties of different molecules. As each of the immobilized template molecules can be 
read individually, no synchronization is needed between the different molecules. Instead, with 
methods of the present invention, asynchronous base extension is sufficient for analyzing a target 
polynucleotide sequence. This allows identification of properties of an individual molecule that
cannot be revealed by bulk measurements in which a large number of molecules are measured together.
For example, to determine kinetics, bulk measurements require synchronization, whereas in single
molecule analysis there is no such need. Further, asynchronous analysis allows for short cycles that
can facilitate analysis of homopolymer stretches, as mentioned above.

[0089] The polynucleotides suitable for analysis with the invention can be DNA or RNA, as
defined below. The analysis can provide sequence analysis, DNA fingerprinting, polymorphism
identification, for example single nucleotide polymorphisms (SNP) detection, as well as methods for
genetic cancer research. Applied to RNA sequences, the analysis can also identify alternate splice
sites, enumerate copy number, measure gene expression, identify unknown RNA molecules present in
cells at low copy number, annotate genomes by determining which sequences are actually transcribed,
determine phylogenic relationships, elucidate differentiation of cells, and facilitate tissue engineering.
The methods can also be used to analyze activities of other biomacromolecules such as RNA
translation and protein assembly. Certain aspects of the present invention lead to more sensitive

[0090] In certain embodiments, the sequencing apparatuses comprise a microfabricated flow
channel to which polynucleotide templates are attached. Optionally, the apparatuses comprise a
plurality of microfabricated channels, and diverse polynucleotide templates can be attached to each
channel. The apparatuses can also have a plurality of reservoirs for storing various reaction reagents
and pumps and valves for controlling flow of the reagents. The flow cell can also have a window to
allow optical interrogation.

[0091] In some embodiments, single stranded polynucleotide templates with primers are 
immobilized to the surface of the microfabricated channel or to the surface of reaction chambers that
are disposed along a microfabricated flow channel, e.g., with streptavidin-biotin links. After
immobilization of the templates, a polymerizing agent and one or more of the four nucleotide 
triphosphates are flowed into the flow cell, incubated with the template, and flowed out. If no signal 
is detected, the process is repeated with one or more different types of nucleotides. 
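
The flow, incubate, and detect cycle described above can be sketched as a small simulation. The sketch assumes an idealized case that this description does not guarantee: one nucleotide type per cycle, error-free incorporation, no blocking moiety (so a whole run of identical bases extends in one cycle), and a template string written in the order the polymerizing agent reads it. All names are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def sequence_by_cycles(template, flow_order="ACGT", max_cycles=100):
    """Simulate cyclic single-nucleotide flows over one primed template.

    Returns a list of (base, count) tuples, one per cycle that gave a signal.
    """
    pos = 0          # next template position to be copied
    read = []
    cycles = 0
    while pos < len(template) and cycles < max_cycles:
        base = flow_order[cycles % len(flow_order)]
        cycles += 1
        n = 0
        # Extend for as long as the flowed base pairs with the template.
        while pos < len(template) and COMPLEMENT[template[pos]] == base:
            pos += 1
            n += 1
        if n:        # a signal is detected; otherwise try the next base
            read.append((base, n))
    return read

print(sequence_by_cycles("TTGCA"))
```

Cycles that give no signal simply advance to the next nucleotide type, mirroring the "if no signal is detected, the process is repeated" step.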

[0092] The use of microfabricated sequencing apparatuses can reduce reagent consumption. It also
increases reagent exchange rate and the speed of sequence analysis. Indeed, using a microfluidic
device, the rate at which the concentrations can be alternated can be as high as a few tens of Hertz.
Additionally, the reduction of time and dead volume for exchanging reagents between different steps
can also greatly reduce mismatch incorporation. Moreover, the read length can also be improved
because there is less time for the polymerizing agent to incorporate a wrong nucleotide and it is less
likely to fall off the template. All these advantages can result in high speed and high throughput
sequence analysis regimes.

[0093] Alternating concentrations of nucleotides can also improve signal visualization and 
polymerization rate in the static approach of sequence analysis. In this approach, after adding a given 
type of labeled nucleotide to the immobilized target polynucleotide-primer complex and allowing
sufficient time for incorporation, free nucleotides (as well as other reaction reagents in solution) can
be flushed out using a microfluidic device. This will leave a much lower concentration of free 
nucleotides when detecting incorporated signals. Optionally, an additional washing step can be 
employed to further reduce the free nucleotide concentration before detecting the signals. 

[0094] Further, in using a microfluidic device which allows fast fluid exchange, concentrations of
nucleotides and/or other reaction reagents can be alternated at different time points of the analysis.
This could lead to increased incorporation rates and sensitivity. For example, when all four types of
nucleotides are simultaneously present in the reaction to monitor dynamic incorporation of
nucleotides, concentrations of the nucleotides can be alternated between µM range and sub-nM range.
This leads to both better visualization of the signals when low concentrations of nucleotides are
present, and increased polymerization rate when higher concentrations of nucleotides are present.

[0095] Certain embodiments of the present invention avoid many of the problems observed with
other sequencing methods. For example, the methods are highly parallel since many molecules can be
analyzed simultaneously at high density (e.g., one template molecule per ~10 µm² of surface area, as
well as about 1 or 2 million per cm²). Thus, many different polynucleotides can be sequenced or
analyzed on a single substrate surface simultaneously. The microfabricated apparatuses facilitate this
parallelization in that many synthesis channels can be built on the same substrate, allowing analysis of
a plurality of diverse polynucleotide sequences simultaneously.
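
The density figures quoted above are easier to compare after unit conversion. The sketch below takes the first figure to mean roughly one molecule per 10 µm² (the unit is damaged in the source text, so this reading is an assumption) and uses hypothetical helper names.

```python
UM2_PER_CM2 = 1e8  # 1 cm = 1e4 µm, so 1 cm² = 1e8 µm²

def per_um2_to_per_cm2(density_per_um2):
    """Convert molecules per µm² to molecules per cm²."""
    return density_per_um2 * UM2_PER_CM2

def per_cm2_to_per_um2(density_per_cm2):
    """Convert molecules per cm² to molecules per µm²."""
    return density_per_cm2 / UM2_PER_CM2

def area_per_molecule_um2(density_per_cm2):
    """Average surface area available to each molecule, in µm²."""
    return UM2_PER_CM2 / density_per_cm2

print(per_um2_to_per_cm2(1 / 10))   # one molecule per 10 µm², in molecules per cm²
print(area_per_molecule_um2(2e6))   # 2 million molecules per cm², in µm² per molecule
```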

II. Target Polynucleotide Preparation

[0096] The present invention provides methods and kits for analyzing the sequence of a target
polynucleotide by detecting incorporation of a nucleotide into its complementary strand. Preparation for
this analysis may include obtaining the target from a source and hybridizing it to a primer.

A. Target Polynucleotide Sources

[0097] The target polynucleotide is not critical and can come from a variety of standard sources. 
For example, nucleic acids can be naturally occurring DNA or RNA isolated from any source, 
recombinant molecules, cDNA, or synthetic analogs, as known in the art. For example, the target 
polynucleotides may be genomic DNA, genes, gene fragments, exons, introns, regulatory elements
(such as promoters, enhancers, initiation and termination regions, expression regulatory factors,
expression controls, and other control regions), DNA comprising one or more single-nucleotide
polymorphisms (SNPs), allelic variants, and other mutations. Also included are the full genome of
one or more cells, for example cells from different stages of diseases such as cancer. The target
polynucleotide may also be mRNA, tRNA, rRNA, ribozymes, splice variants, antisense RNA, and
RNAi. Also included are RNA with a recognition site for binding a polymerizing agent, transcripts of
a single cell, organelle or microorganism, and all or portions of RNA complements of one or more 
cells, for example, cells from different stages of development or differentiation, and cells from 
different species. Polynucleotide can be obtained from any cell of a person, animal, plant, bacteria, or 
virus, including pathogenic microbes or other cellular organisms. 

[0098] Templates suitable for analysis according to the present invention can have various sizes.
For example, the template can have a length of about 10 bases, about 20 bases, about 30 bases, about
40 bases, about 50 bases, about 60 bases, about 70 bases, about 80 bases, 90 bases, 100 bases, about
200 bases, about 500 bases, about 1 kb, about 3 kb, about 10 kb, or about 20 kb and so on.

[0099] When the target is from a biological source, a variety of known procedures may be used for
extracting the polynucleotide and optionally amplifying to a concentration convenient for genotyping
or sequence work. Recombinant or synthetic polynucleotides may also be amplified. Polynucleotide
amplification methods are known in the art. Preferably, the amplification is carried out by polymerase
chain reaction (PCR). See, U.S. Pat. Nos. 4,683,202, 4,683,195 and 4,889,818; Gyllenstein et al.,
1988, Proc. Natl. Acad. Sci. USA 85: 7652-7656; Ochman et al., 1988, Genetics 120: 621-623; Loh et
al., 1989, Science 243: 217-220; Innis et al., 1990, PCR Protocols, Academic Press, Inc., San Diego,
Calif. Other amplification methods known in the art that can be used in the present invention include
ligase chain reaction (see EP 320,308), or methods disclosed in Kricka et al., 1995, Molecular
Probing, Blotting, and Sequencing, Chap. 1 and Table IX, Academic Press, New York.

[0100] In some applications, the polynucleotides to be analyzed are first cloned in single-stranded
M13 plasmid (see, e.g., Current Protocols In Molecular Biology, Ausubel, et al., eds., John Wiley &
Sons, Inc. 1995; and Sambrook, et al., Molecular Cloning. A Laboratory Manual, Cold Spring Harbor
Press, 1989). The single stranded plasmid can be primed by 5'-biotinylated primers (see, e.g., U.S.
Patent No. 5,484,701), and double stranded plasmid can then be synthesized. The double stranded
circle can then be linearized, and the biotinylated strand purified.

B. Primer Hybridization 

[0101] Analyzing a target polynucleotide by synthesizing its complementary strand may involve
hybridizing an oligonucleotide primer to the target. The primer can be selected to be sufficiently long
to prime the synthesis of extension products in the presence of a polymerizing agent. Primer length
can be selected to facilitate hybridization to a sufficiently complementary region of the template
polynucleotide downstream of the region to be analyzed. The exact lengths of the primers depend on
many factors, including temperature, source of primer and the use of the method. For example,
primers may be at least about 10 bases in length, at least about 15, or at least about 30 bases in length.
[0102] If part of the region downstream of the sequence to be analyzed is known, a specific primer 
can be constructed and hybridized to this region of the template. Alternatively, if sequences of the 
downstream region on the template polynucleotide are not known, universal or random primers may 
be used in random primer combinations. As another approach, oligonucleotide adaptors can be joined
to the ends of target polynucleotide by a ligase and primers can be designed to bind to these adaptors.
That is, an adaptor or linker can be ligated to target polynucleotides of unknown sequence to allow for 
primer hybridization. Alternatively, known sequences may be biotinylated and ligated to the targets. 
In yet another approach, nucleic acid may be digested with a restriction endonuclease, and primers 
designed to hybridize with the known restriction sites that define the ends of the fragments produced. 

[0103] The primers can be synthetically made using conventional nucleic acid synthesis
technology. For example, primers can be conveniently synthesized on an automated DNA
synthesizer, an Applied Biosystems, Inc. (Foster City, Calif.) model 392 or 394 DNA/RNA
Synthesizer, using standard chemistries, such as phosphoramidite chemistry, e.g., as disclosed in
Beaucage and Iyer, Tetrahedron, 48: 2223-2311 (1992), and the like. Alternative chemistries, e.g.,
resulting in non-natural backbone groups, such as phosphorothioate, phosphoramidate, and the like,
may also be employed provided that, for example, the resulting oligonucleotides are compatible with
the polymerizing agent. The primers can also be ordered commercially from a variety of companies
which specialize in custom oligonucleotides such as Operon Inc (Alameda, Calif.).

[0104] In some embodiments, the primer bears a labeling moiety. When hybridized to an anchored
polynucleotide molecule, the labeling moiety facilitates locating the bound molecule through imaging.
As exemplified in the Examples below, the primer can be labeled with a fluorescent labeling moiety
(e.g., Cy5), or any other means used to label nucleotides. The labeling moiety used to label the primer
can be different from the labeling moieties used on the nucleotides in the subsequent polymerization
reactions. Correlation of the signal of the different types of labeling moieties can also facilitate
locating bound molecules as well as locating bound molecules capable of acting as useful templates
for complementary strand synthesis.






[0105] If the target polynucleotide-primer complex is to be anchored on a surface of a substrate, 
the primer can be hybridized before or after such anchoring. Primer annealing can be performed 
under conditions which are stringent enough to require sufficient sequence specificity, yet permissive 
enough to allow formation of stable hybrids at an acceptable rate. The temperature and time required 
for primer annealing depend upon several factors including base composition, length, and 
concentration of the primer; the nature of the solvent used, e.g., the concentration of DMSO,
formamide, or glycerol; as well as the concentrations of counter ions, such as magnesium. Typically,
hybridization with synthetic polynucleotides is carried out at a temperature that is approximately 5 to
approximately 10 °C below the melting temperature (Tm) of the target polynucleotide-primer
complex in the annealing solvent. In some embodiments, the annealing temperature is in the range of
about 55 to about 75 °C and the primer concentration is approximately 0.2 µM. Other conditions of
primer annealing are provided in the Examples below. In certain embodiments, the annealing reaction
can be complete within a few seconds.
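
The "5 to 10 °C below Tm" guideline can be turned into a small calculation. In the sketch below, the Tm estimate uses the common rule of thumb of 2 °C per A/T and 4 °C per G/C, which is a swapped-in approximation for short oligonucleotides and not a method given in this description; in practice Tm also depends on solvent and counter-ion conditions, as noted above. The primer sequence and function names are hypothetical.

```python
def wallace_tm_c(primer):
    """Rule-of-thumb Tm in °C: 2 per A/T plus 4 per G/C (short oligos only)."""
    seq = primer.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2.0 * at + 4.0 * gc

def annealing_window_c(tm_c, low_offset=10.0, high_offset=5.0):
    """Annealing range lying approximately 5 to 10 °C below Tm."""
    return (tm_c - low_offset, tm_c - high_offset)

primer = "ACGTTGCAAGGCTTACGGAT"  # hypothetical 20-mer, for illustration only
tm = wallace_tm_c(primer)
print(tm, annealing_window_c(tm))
```

A measured or nearest-neighbor Tm could be passed to `annealing_window_c` directly in place of the rule-of-thumb estimate.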

III. Surface Treatment and Polynucleotide Anchoring
A. Treatment of Substrate Surface 

[0106] The surface chemistry created by methods described herein provides various advantages to
carrying out the present invention. In some applications, for example, the surface of the substrate (or
synthesis channel) is pretreated to create surface chemistry that facilitates high density polynucleotide
attachment with single molecule resolution, where the polynucleotide molecules are available for
subsequent synthesis reactions. Coating the substrate (e.g., a microchannel) surface with the PEM
and other techniques described herein can be significant for analyzing polynucleotide sequences
according to the present invention.

[0107] For example, certain embodiments of the present invention feature a substrate coated with 
at least one layer of polyanions to which a polynucleotide molecule is anchored, where the
polynucleotide molecule is hindered from lying down on the layer. The electrostatic repulsion
between the negatively-charged polynucleotide backbone and the negatively-charged anionic layer 
helps keep the polynucleotide molecule in a substantially upright position relative to the layer. In 
some embodiments, the surface is thus exposed to a negative layer and a polynucleotide molecule 
anchored thereto. 

[0108] In some embodiments, multiple layers of alternating positive and negative charges are used.
In the case of incompletely-charged surfaces, multiple-layer deposition tends to increase surface
charge to a well-defined and stable level.

[0109] In some embodiments, for example, the surface is coated with a polyelectrolyte multilayer
(PEM). In some methods, PEM based surface chemistry can be created prior to template or primer
attachment. Preferably, the substrate surface is coated with a polyelectrolyte multilayer (PEM).
Attachment of templates and/or primers to PEM-coated surface can be accomplished by light-directed
spatial attachment (see, e.g., U.S. Pat. Nos. 5,599,695, 5,831,070, and 5,959,837). Alternatively, the
templates and/or primers can be attached to PEM-coated surface entirely chemically. In some
embodiments, non-PEM based surface chemistry can be created prior to template and/or primer
attachment.

[0110] PEM formation has been described in Decher et al. (Thin Solid Films, 210:831-835, 1992).
PEM formation proceeds by the sequential addition of polycations and polyanions, which are
polymers with many positive or negative charges, respectively. Upon addition of a polycation to a
negatively-charged surface, the polycation deposits on the surface, forming a thin polymer layer and
reversing the surface charge. Similarly, a polyanion deposited on a positively charged surface forms a
thin layer of polymer and leaves a negatively charged surface. Alternating exposure to poly(+) and
poly(-) generates a polyelectrolyte multilayer structure with a surface charge determined by the last
polyelectrolyte added. This can produce a strongly-negatively-charged surface, repelling the
negatively-charged nucleotides and preventing lying down.

[0111] In certain embodiments, for example, methods of preventing a substrate-anchored
polynucleotide from lying down on the substrate involve exposing the substrate to a positive or
negative polyelectrolyte, washing, exposing the substrate to a polyelectrolyte of opposite charge from
the one previously used, repeating the alternating layers any number of times, and terminating with a
layer of negative polyelectrolyte. Each polyelectrolyte step can be, e.g., about 10 minutes, and a wash
step can be carried out by thorough rinsing with high purity water. The negative polyelectrolyte may
be a polystyrene sulphonate polymer, a polyglutamic acid polymer and/or a polyacrylic acid polymer.
The positive polyelectrolyte may be a polylysine polymer, a polyethyleneimine polymer and/or a
poly(allylamine) polymer. The number of alternating layers may be about two, about three, about
four, about five, about six, and so on.
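
The layering procedure above can be written out as an ordered step list. The sketch below is only a bookkeeping aid under stated assumptions: the function name is hypothetical, every exposure is followed by a wash, and the charge of the first layer is chosen so that the final layer is negative, as the paragraph requires.

```python
def pem_protocol(n_layers, step_minutes=10):
    """Return ordered (action, detail) steps for n alternating layers,
    arranged so that the final layer is always negative."""
    if n_layers < 1:
        raise ValueError("need at least one layer")
    steps = []
    for i in range(n_layers):
        # Count back from the last layer: an even distance means negative.
        charge = "negative" if (n_layers - 1 - i) % 2 == 0 else "positive"
        steps.append(("expose",
                      f"{charge} polyelectrolyte, about {step_minutes} min"))
        steps.append(("wash", "rinse thoroughly with high-purity water"))
    return steps

for action, detail in pem_protocol(4):
    print(action, "-", detail)
```

With an even number of layers the sequence starts with a positive polyelectrolyte, which suits a substrate that is itself negatively charged; that pairing is an inference from the charge-reversal description, not a stated requirement.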

[0112] Further, an upright orientation helps the anchored polynucleotide remain available for
polymerizing reactions, serving as a useful template for a polymerizing agent synthesizing the
complementary strand. That is, the attached template can be read by polymerizing agents - most
probably because the repulsion of like charges hinders the template from lying down on the surface.
Without being bound to any particular theory, the negative electrostatic shielding at the surface
probably repels the unanchored end of the polynucleotide molecule away from the surface, reducing
surface-promoted denaturation of the polymerizing agent and/or reducing steric hindrances that might
inhibit polymerizing activity.

[0113] Binding large quantities of polynucleotides may not be useful if the target
polynucleotide-primer complex cannot be extended by a polymerizing agent. For example, this problem
may arise from surface chemistry bearing amines, which are positively charged at normal pH. The
negatively-charged polynucleotide backbone can non-specifically stick to such a surface, sterically
impeding the polymerizing agent from adding nucleotides. Some embodiments avoid such problems by
coating a substrate with a PEM and anchoring a polynucleotide molecule to it to allow nucleotide
incorporation into its complementary strand in the presence of a polymerizing agent.

[0114] An upright orientation also facilitates detection, e.g., detection of a fluorescent moiety
incorporated into the growing complementary strand. Detection is also facilitated by reduction of
background signals. That is, in certain embodiments, surface chemistry reduces background by
reducing non-specific attachment of free labeled nucleotides to the surface of the substrate. For
example, it can render nonspecific binding of fluorescently-labeled nucleotides very low, because
negative charges of the terminal surface layer can repel negatively-charged free nucleotides bearing
fluorescent moieties. In certain embodiments, the substrate bears a layer of polyanions sufficient to
reduce nonspecific attachment of negative moieties by a factor of at least about 5, at least about 6, at
least about 7, at least about 8, at least about 9, at least about 10, at least about 11, at least about 12, at
least about 13, at least about 14, and at least about 15 compared to an uncoated surface of the
substrate. This can achieve low density of non-specifically attached nucleotide molecules. Further,
the polymeric nature of the PEM can result in increased charge density for each depositing layer,
facilitating fine tuning of the charge density and covering any inhomogeneities on the surface that
may become sites for non-specific attachment.

[0115] If there is significant nonspecific binding of nucleotides bearing fluorescent moieties to the
surface, it may become impossible to distinguish between signal due to incorporation and signal due
to nonspecific binding. Fluorescently-labeled nucleotides generally exhibit relatively strong
nonspecific binding to many surfaces, because they can possess both a strongly polar moiety (the
nucleotide, and in particular any triphosphate) and a relatively-hydrophobic moiety (e.g., the
fluorescent dye). A surface bearing positively-charged groups (e.g., amines) can promote very high
nonspecific binding due to the attraction of the negatively charged nucleotides to the
positively-charged surface groups, e.g., amines. Neutral surfaces generally also exhibit strong nonspecific
binding due to the fluorescently-labelled nucleotide acting as a surfactant (i.e., assembling with its
nonpolar moiety directed towards the uncharged (more hydrophobic) surface and its polar end
directed towards the aqueous phase). Glass is a negatively-charged surface in water, but the surface
silanols that create the negative charge are a difficult target for directly attaching polynucleotides.
Typical attachment protocols use silanization (often with aminosilanes); however, as discussed above,
amino groups can lead to unacceptable levels of nonspecific binding. Using the surface chemistries
described herein, however, can facilitate methods of detecting synthesis of a single polynucleotide
molecule, for example, by coating a substrate with a PEM, anchoring the polynucleotide molecule to
the PEM at single molecule resolution, and detecting incorporation of a nucleotide bearing a labeling
moiety.

[0116] In certain embodiments, the polynucleotide molecule that serves as a template for
polymerization is selected to be of a certain length and anchored to the surface of the substrate.
Longer length templates further facilitate detection of incorporated fluorescently-labeled nucleotides,
as the incorporated fluorescent moieties are held away from the surface. For example, using a
polynucleotide template of a certain length attached to a surface bearing a negatively-charged layer, a
single molecule of a fluorescently-labeled nucleotide can be detected when it becomes hybridized to
the template or incorporated into its complementary strand. The single molecule can be detected over
background fluorescence from unincorporated fluorescently-labeled nucleotide molecules. The
polynucleotide template used may be at least about 30 nucleotide residues, at least about 40
nucleotide residues, at least about 50 nucleotide residues, at least about 60 nucleotide residues, at least
about 70 nucleotide residues, at least about 80 nucleotide residues, and at least about 90 nucleotide
residues. The polynucleotide template used may be covalently or non-covalently attached to the
surface, e.g., by biotin-streptavidin coupling.

[0117] Figure 16 illustrates on-surface incorporation in anchored DNA being visualized at the 
single-DNA level. Figure 16a illustrates points of incorporation of fluorescently-labeled nucleotide in 
the presence of a DNA polymerase. Figure 16b illustrates the result where no polymerizing agent is 
present and Figure 16c illustrates the result where both fluorescently-labeled nucleotides and 
polymerizing agent are withheld. Comparison of Figure 16a with the controls 16b-c indicates that 
over 95% of the observed objects in Figure 16a represent single molecules of DNA. 

[0118] Where more than one nucleotide of the same base-type becomes incorporated into the
growing complementary strand, the number of nucleotides incorporated may be determined by
quantifying the intensity of signal from labeling moieties on the incorporated nucleotides. Reduction
of background signal, e.g., from unincorporated fluorescently-labeled nucleotides, also facilitates this
quantification. For example, a polynucleotide template of a certain length can be attached to a surface
bearing a negatively-charged layer and fluorescence from a number of bound fluorescently-labeled
nucleotides measured over background interference from unbound fluorescently-labeled nucleotides,
so that the measurement quantifies the number of bound nucleotide residues. Such embodiments can
allow quantification of a number of repeat bases, that is, consecutive nucleotide residues each bearing
the same base-type, e.g., in a homopolymer stretch. The number of repeat bases may be about two,
about three, about 4, about 5, about 6, about 7, and about 8. As mentioned before, the polynucleotide
template used may be at least about 30 nucleotide residues, at least about 40 nucleotide residues, at
least about 50 nucleotide residues, at least about 60 nucleotide residues, at least about 70 nucleotide
residues, at least about 80 nucleotide residues, and at least about 90 nucleotide residues; and may be
covalently or non-covalently attached to the surface, e.g., by biotin-streptavidin coupling.
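
Counting repeat bases from signal intensity can be sketched as a background-subtracted ratio. The sketch assumes that intensity scales linearly with the number of labels at a site, which is an idealization introduced here (the description itself notes that long runs may become hard to quantify); the function name and the example intensities are hypothetical.

```python
def estimate_repeat_count(signal, background, single_label_intensity):
    """Estimate how many labeled nucleotides were incorporated at one site.

    Subtracts the background, divides by the intensity of a single label,
    and rounds to the nearest whole number of residues.
    """
    if single_label_intensity <= 0:
        raise ValueError("single_label_intensity must be positive")
    net = max(signal - background, 0.0)  # clamp noise below background to zero
    return round(net / single_label_intensity)

# Hypothetical intensities in arbitrary units.
print(estimate_repeat_count(signal=330.0, background=30.0,
                            single_label_intensity=100.0))
```

Lower background makes the rounding step more reliable, which is the reason the surface chemistry matters for this measurement.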

[0119] Surface chemistries of the present invention can also facilitate anchoring reasonable
quantities of polynucleotide at high surface density. The terminal negative layer may bear groups that
facilitate attachment of polynucleotide molecules, for example by covalent linkage between the group
pairs. Such treatment allows a high density of polynucleotide coverage with single molecule
resolution as described in more detail below.

[0120] In certain embodiments, surface chemistries of the present invention can be used to create
an array comprising a substrate, a PEM coating the substrate, and polynucleotide molecules anchored
to the substrate at a density allowing visualization of the individual polynucleotide molecules. If
insufficient numbers of template molecules were to be bound, the signal-to-noise ratio might become
too low to allow useful sequencing. In some embodiments, the polynucleotide molecules are at a
density of at least about 0.1, at least about 0.2, at least about 0.3, at least about 0.4, at least about 0.5,
at least about 0.6, at least about 0.7, at least about 0.8, at least about 0.9, and at least about 1
polynucleotide molecule per µm².

[0121] Detailed procedures for coating a substrate with PEM for immobilizing polynucleotide are
described in the Examples below. Briefly, the surface of the substrate (e.g., a glass cover slip) can be
cleaned with an RCA solution. After cleaning, the substrate can be coated with a polyelectrolyte
multilayer (PEM), terminating with carboxylic acid groups. Following biotinylation of the carboxylic
acid groups, streptavidin can be applied to generate a surface capable of capturing biotinylated
molecules. Biotinylated polynucleotide templates or primers can then be added to the coated substrate
for anchoring. During the anchoring step, a high concentration of cations, e.g., Mg²⁺, can be used to
screen the electrostatic repulsion between the negatively-charged polynucleotides and the negatively-
charged PEM surface. In subsequent steps, the cation concentration can be reduced to re-activate
repulsive shielding. By titrating biotinylated polynucleotide molecules, it is possible to bind such a
small number of molecules to the surface that they are separated by more than the diffraction limit of
optical instruments and thus able to be visualized individually.
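
The relationship between surface density and individual visualization can be illustrated numerically. The Python sketch below is not part of the described procedure. It assumes that molecules bind independently and uniformly at random (a two-dimensional Poisson model), and it uses a hypothetical resolution limit of 0.25 µm; the function name and that limit are illustrative assumptions only.

```python
import math


def isolated_fraction(density_per_um2: float, min_separation_um: float) -> float:
    """Fraction of molecules with no neighbour closer than min_separation_um.

    Assumes a uniform 2-D Poisson distribution of anchored molecules, so the
    chance that a disc of radius r around a molecule is empty is exp(-rho*pi*r^2).
    """
    if density_per_um2 < 0 or min_separation_um < 0:
        raise ValueError("density and separation must be non-negative")
    return math.exp(-density_per_um2 * math.pi * min_separation_um ** 2)


if __name__ == "__main__":
    limit_um = 0.25  # hypothetical optical resolution limit, not taken from the text
    for density in (0.1, 0.5, 1.0):  # molecules per square micrometre
        frac = isolated_fraction(density, limit_um)
        print(f"{density:.1f} molecules/um^2: {frac:.1%} isolated at {limit_um} um")
```

Under these assumptions, lowering the applied density raises the fraction of molecules that can be resolved individually, which is the purpose of the titration step.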

[0122] The attachment scheme described here can be readily generalized. Without modification,
the PEM/biotin/streptavidin surface produced can be used to capture or immobilize any biotinylated
molecule. A slight modification can be the use of another capture pair, for example, substituting
digoxigenin (dig) for biotin and labeling the molecule to be anchored with anti-digoxigenin (anti-dig).
Reagents for biotinylation or dig-labeling of amines are both commercially available.
[0123] The fact that the chemistry is nearly independent of the surface chemistry of the support
permits further generalization. Glass, for instance, can support PEMs terminated with either positive
or negative polymers, and a wide variety of chemistry is available for either. But other substrates
such as silicone, polystyrene, polycarbonate, etc., or even membranes and/or gels, which are not as
strongly charged as glass, can still support PEMs. The charge of the final layer of PEMs on weakly-
charged surfaces becomes as high as that of PEMs on strongly-charged surfaces, as long as the PEM
has sufficiently many layers. For example, PEM formation on O2-plasma treated silicone rubber has
been demonstrated by the present inventors. Thus, advantages of the
glass/PEM/biotin/streptavidin/biotin surface chemistry can readily be applied to other
substrates.

[0124] In microfluidic embodiments, the attachment schemes can be either ex-situ or in-situ. With
the ex-situ protocol, for example, the surface of the substrate is coated with PEM first, followed by
template/primer attachment. An elastomeric microfluidic chip is then bonded to the substrate to form
and seal the synthesis channel. With the in-situ protocol, on the other hand, the microfluidic chip is
attached to the flat substrate first, and a PEM is then constructed within the channels. The
templates/primers are then attached inside the channels. In still other embodiments, the microfluidic
chip can be bonded to the substrate at any point in the template/primer attachment process, and the
remaining steps can be completed inside the microfluidic channels.

[0125] Certain embodiments described herein lead to a good seal between the microfluidic components
and the synthesis channels. A good seal between the microfluidic components and the synthesis channels
allows the use of higher pressures, which in turn increases flow rates and decreases exchange times.
[0126] Although the above discussion describes the immobilization of polynucleotide templates or
primers by attachment to the surface of flow channels (or the surface of reaction chambers disposed
along flow channels), other methods of template immobilization can also be employed in certain
embodiments of the present invention. In some embodiments, for example, the templates or primers
can be attached to microbeads, which can be arranged within the microfluidic system. For instance,
commercially-available latex microspheres with pre-defined surface chemistry can be used. The
polynucleotide templates or primers can be attached either before or after the microbeads are introduced
into the microfluidic system. Attachment of template or primer before beads are added may allow a
reduction in system complexity and setup time (as many templates or primers can be attached to
different aliquots of beads simultaneously). Attachment of template or primer to beads in situ can
allow easier manipulation of surface chemistry (as bead surface chemistry can be manipulated in bulk,
externally to the microfluidic device). Beads can be held in place within the flow system, for
example, by flowing the beads into orifices too small for them to flow through (where they become
"wedged in"), creating "micro-screens" (i.e., barriers in the channel with apertures too small for beads
to pass through), and inserting the beads into hollows in the channels where they are affixed by simple
van der Waals forces.
B. Polynucleotide Anchoring 

[0127] In some embodiments, the template or target polynucleotide molecules are provided as
single molecule arrays anchored to the surface of a substrate. The substrate can be a solid support
(e.g., glass, silica, or plastic), a semi-solid support (e.g., a gel or other matrix), and/or a porous support
(e.g., a nylon membrane or other membrane) or any other conventionally non-reactive material. In
some embodiments, the substrate is selected to not create significant noise or background for
fluorescent detection methods. The substrate surface to which target polynucleotides are to be
anchored can also be the internal surface of a flow cell in a microfluidic apparatus, e.g., a
microfabricated synthesis channel. By anchoring the templates, unincorporated nucleotides can be
removed from the synthesis channels by a washing step. In some embodiments, the substrate is made
from a fused silica slide (e.g., a fused silica glass slide from Esco, Cat. R130110). Compared to some
other substrate materials (e.g., a regular glass slide), fused silica has very low auto-fluorescence, which
may be desirable in certain embodiments.

[0128] In some applications of the present invention, the polynucleotides are anchored or
immobilized to the substrate surface with single molecule resolution. In such methods, as exemplified
in the Examples below, single molecule resolution is achieved by using a very low concentration of the
polynucleotide in the immobilization reaction. For example, a 10 pM concentration for an 80-mer
polynucleotide template allows attachment of the polynucleotide to the surface of a silica slide at
single molecule resolution (see Example 1). Template immobilization with single molecule resolution
can also be verified by measuring the bleach pattern of fluorescently-labeled templates (see Example 5).
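
To give a sense of scale for the 10 pM figure, the following Python sketch converts a molar concentration into a molecule count and into an upper bound on surface density. Only the 10 pM concentration comes from the text. The applied volume and wetted area are hypothetical values chosen for illustration, and the bound assumes, unrealistically, that every molecule in the applied volume binds to the surface.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def molecules_per_microliter(molar: float) -> float:
    """Number of molecules in one microliter of a solution of the given molarity."""
    return molar * AVOGADRO * 1e-6  # 1 microliter = 1e-6 liter


def max_surface_density(molar: float, volume_ul: float, area_um2: float) -> float:
    """Upper bound on molecules per um^2 if every molecule in the volume bound."""
    if area_um2 <= 0:
        raise ValueError("area must be positive")
    return molecules_per_microliter(molar) * volume_ul / area_um2


if __name__ == "__main__":
    conc = 10e-12       # 10 pM, the concentration cited in the text
    volume_ul = 10.0    # hypothetical applied volume
    area_um2 = 1e8      # hypothetical wetted area (1 cm^2 expressed in um^2)
    print(f"{molecules_per_microliter(conc):.3g} molecules per microliter")
    bound = max_surface_density(conc, volume_ul, area_um2)
    print(f"at most {bound:.2f} molecules per um^2")
```

Because only a fraction of the applied molecules would be expected to bind, the actual density would fall below this bound.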
[0129] In some embodiments, the target polynucleotides are immobilized to the surface prior to
hybridization to the primer. In certain embodiments, the target polynucleotides are hybridized to the
primers first and then immobilized on the surface. In still other embodiments, the primers are
immobilized to the surface, and the target polynucleotides are attached to the substrates through
hybridization with the primers. In some embodiments, the primer is hybridized to the target
polynucleotide prior to providing nucleotides for the polymerization reaction. In some, the primer is
hybridized to the target polynucleotide while the nucleotides are being provided. In still other
embodiments, the polymerizing agent is immobilized to the surface.

[0130] Various methods can be used to anchor or immobilize the target polynucleotides or the
primers to the surface of the substrate, such as the surface of the synthesis channels or reaction
chambers. The immobilization can be achieved through direct or indirect bonding to the surface. The
bonding can be by covalent linkage. See, Joos et al., Analytical Biochemistry 247:96-101, 1997;
Oroskar et al., Clin. Chem. 42:1547-1555, 1996; and Khandjian, Mole. Bio. Rep. 11:107-115, 1986.
The bonding can also be through non-covalent linkage. For example, biotin-streptavidin (Taylor et
al., J. Phys. D. Appl. Phys. 24:1443, 1991) and digoxigenin with anti-digoxigenin (Smith et al.,
Science 253: 1122, 1992) are common tools for anchoring polynucleotides to surfaces and parallels.
Alternatively, the attachment can be achieved by anchoring a hydrophobic chain into a lipidic
monolayer or bilayer. Other methods for attaching nucleic acids to supports can also be used.
[0131] When biotin-streptavidin linkage is used to anchor the polynucleotides, the polynucleotides
can be biotinylated, while one surface of the substrates (e.g., one surface of the synthesis channels)
can be coated with streptavidin. Since streptavidin is a tetramer, it has four biotin binding sites per
molecule. Thus, it can provide linkage between the surface and the polynucleotide. In order to coat a
[...] streptavidin can be used to anchor the protein to the surface, leaving the other sites free to bind the
biotinylated polynucleotide (see, Taylor et al., J. Phys. D. Appl. Phys. 24:1443, 1991). Such treatment
leads to a high density of streptavidin on the surface of the substrate (e.g., the synthesis channel),
allowing a correspondingly high density of template coverage. Surface density of the polynucleotide
molecules can be controlled by adjusting the concentration of the polynucleotide applied to the
surface. Reagents for biotinylating a surface can be obtained, for example, from Vector Laboratories.
Alternatively, biotinylation can be performed with BLCPA: EZ-Link Biotin LC-PEO-Amine (Pierce,
Cat. 21347), or any other known or convenient method.
[0132] In some embodiments, labeled streptavidin (e.g., streptavidin bearing a labeling moiety such
as a fluorescent label) of very low concentration (e.g., in the µM, nM or pM range) is used to coat the
substrate surface prior to anchoring. This can facilitate immobilization of the polynucleotide with
single molecule resolution. It also can allow detecting spots on the substrate to determine where the
polynucleotide molecules are attached, and to monitor subsequent nucleotide incorporation events.

[0133] While diverse polynucleotide templates can be each immobilized to and sequenced in a
separate substrate or in a separate synthesis channel, multiple templates can also be analyzed on a
single substrate (e.g., in a single microfluidic synthesis channel). In the latter scenario, the templates
can be bound to different locations on the substrate (e.g., at different locations along the flow path of
the channel). This can be accomplished by a variety of different methods, including hybridization of
primer capture sequences to oligonucleotides immobilized at different points on the substrate (e.g., the
channel), and sequential activation of different points down the substrate (e.g., the channel) towards
template immobilization.

[0134] Methods of creating surfaces with arrays of oligonucleotides have been described, e.g., in
U.S. Pat. Nos. 5,744,305, 5,837,832, and 6,077,674. In certain embodiments, such surfaces can be
used as a substrate to be bonded to a microfluidic chip to form a synthesis channel. Primers with two
domains, a priming domain and a capture domain, can be used to anchor polynucleotide targets to the
substrate. The priming domain is complementary to a region of the target polynucleotide. The
capture domain is present on the non-extended side of the priming sequence. It is not complementary
to the target template, but rather to a specific oligonucleotide sequence present on the substrate. The
target polynucleotides can be separately hybridized with their primers, or (if the priming sequences are
different) hybridized together in the same solution. Incubation of the target polynucleotide-primer
complexes with the substrate (e.g., in the flow channel) under hybridization conditions allows
attachment of each to a unique spot. Multiple substrates (e.g., multiple synthesis channels) can be
charged with polynucleotides in this fashion simultaneously.
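
The two-domain primer layout described above can be sketched in a few lines of Python. This is an illustration only, not a design tool. All sequences are written 5' to 3', the function names are assumptions, and the example sequences are made up. Because a polymerase extends the primer from its 3' end, the capture domain is placed on the 5' (non-extended) side.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3' (A, C, G, T only)."""
    seq = seq.upper()
    if set(seq) - set("ACGT"):
        raise ValueError("sequence may contain only A, C, G and T")
    return seq.translate(_COMPLEMENT)[::-1]


def design_capture_primer(target_region: str, surface_oligo: str) -> str:
    """Return a primer laid out as 5'-[capture domain][priming domain]-3'.

    The priming domain pairs with target_region; the capture domain pairs
    with the oligonucleotide immobilized on the substrate.
    """
    capture_domain = reverse_complement(surface_oligo)
    priming_domain = reverse_complement(target_region)
    return capture_domain + priming_domain


if __name__ == "__main__":
    # Hypothetical sequences, chosen only to show the layout.
    primer = design_capture_primer(target_region="ATGCCGTA", surface_oligo="GGAATTCC")
    print(primer)
```

Giving each template's primer a capture domain matching a different surface oligonucleotide is what would direct each complex to its own spot.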

[0135] Another method for attaching multiple polynucleotides to the surface of a single substrate
(e.g., in a single channel) is to sequentially activate portions of the substrate and attach template to
them. Activation of the substrate can be achieved by either optical or electrical means. Optical
illumination can be used to initiate a photochemical deprotection reaction that allows attachment of
the polynucleotide molecule to the surface (see, e.g., U.S. Pat. Nos. 5,599,695, 5,831,070, and
5,959,837). For instance, the substrate surface can be derivatized with "caged biotin", a commercially
available derivative of biotin that becomes capable of binding to avidin only after being exposed to
light. Polynucleotides can then be attached by exposure of a site to light, filling the channel with
avidin solution, washing, and then flowing biotinylated template into the channel. Another variation
is to prepare avidinylated substrate and a target polynucleotide with a primer with a caged biotin
moiety; the target polynucleotide can then be anchored by flowing into the channel, while
illuminating the solution above a desired area. Activated target polynucleotide-primer complexes are
then attached to the first wall they diffuse to, yielding a diffusion-limited spot.

[0136] Electrical means can also be used to direct polynucleotide molecules to specific locations
on a substrate or in a channel. By positively charging one electrode in the channel and negatively
charging the others, a field gradient can be created which drives the polynucleotide molecule to a
single electrode, where it can attach (see, e.g., U.S. Pat. Nos. 5,632,957, 6,051,380, and 6,071,394).
Alternatively, it can be achieved by electrochemically activating regions of the surface and changing
the voltage applied to the electrodes. Patterning of particular chemicals, including proteins and
polynucleotides, is possible with a stamp method, in which a microfabricated plastic stamp is pressed
on the surface (see, e.g., Lopez et al., J. Amer. Chem. Soc. 115:10774-81, 1993).
[0137] In certain embodiments, different polynucleotides can also be attached to the surface
randomly as the reading of each individual molecule may be analyzed independently from the others.
Any other known methods for attaching polynucleotides and/or proteins may be used.

IV. COMPLEMENTARY STRAND SYNTHESIS

[0138] After preparing the target polynucleotide and possibly anchoring it on the surface of a
substrate, primer extension reactions can be performed (e.g., as described in Sambrook, supra;
Ausubel, supra; and Hyman, Anal. Biochem., 174, p. 423, 1988) to analyze the target polynucleotide
sequence by synthesizing its complementary strand. In some embodiments, the primer is extended by
a polymerizing agent in the presence of a single type of nucleotide bearing a labeling moiety. In other
embodiments, all four types of nucleotides are present, each bearing a detectably distinguishable
labeling moiety. In some applications of the present invention, a combination of labeled and non-
labeled nucleotides is used in the analysis.

[0139] A labeling moiety can be incorporated into the target polynucleotide-primer complex when
the specific nucleotide bearing the labeling moiety is complementary to the nucleotide on the template
adjacent to the 3' end of the primer. Optionally, the target polynucleotide-primer complex is
subsequently washed to remove unincorporated labeling moieties, and the presence of any
incorporated labeling moiety is detected. Reaction conditions and incubation times may be chosen to
reduce polymerization errors.
A. Polymerizing Agents

[0140] Various polymerizing agents can be selected for use in this invention. For example,
depending on the template, a DNA polymerase, an RNA polymerase, or a reverse transcriptase can be
used in the primer extension reactions. For analysis of DNA templates, many DNA polymerases are
available. Examples include, but are not limited to, E. coli DNA polymerase, Sequenase 2.0®, T4
DNA polymerase or the Klenow fragment of DNA polymerase I, T3, AMV, M-MLV, and/or Vent
[...] activity, polymerases lacking 3'→5' exonuclease activity would not be used.
[0141] Rather than thermodegradable polymerizing agents, in some embodiments thermostable
polymerases are used, such as ThermoSequenase™ (Amersham) or Taquenase™ (ScienTech, St.
Louis, Mo.). Further examples include other thermostable polymerases isolated from Thermus
aquaticus, Thermus thermophilus, Pyrococcus woesei, Pyrococcus furiosus, Thermococcus litoralis,
and Thermotoga maritima.

[0142] The polymerizing agent can have a fidelity (incorporation accuracy) of at least about 99%
and a processivity (number of nucleotides incorporated before the enzyme moiety dissociates from the
template) of at least about 20 nucleotides. Examples include T7 DNA polymerase, T7 DNA
polymerase complexed with T7 helicase/primase, T5 DNA polymerase, HIV reverse transcriptase, E.
coli DNA pol I, T4 DNA polymerase, T7 RNA polymerase, Taq DNA polymerase, E. coli RNA
polymerase, and Phi29 DNA polymerase.
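
As a rough illustration of what a per-base fidelity of about 99% implies over a stretch of about 20 nucleotides, the Python sketch below treats each incorporation as an independent event. That independence is a simplifying assumption made here, not a claim from the text, and the function names are illustrative.

```python
def error_free_probability(fidelity: float, n_bases: int) -> float:
    """Probability that n_bases consecutive incorporations are all correct,
    assuming each incorporation is independent with the given per-base fidelity."""
    if not 0.0 <= fidelity <= 1.0:
        raise ValueError("fidelity must be between 0 and 1")
    if n_bases < 0:
        raise ValueError("n_bases must be non-negative")
    return fidelity ** n_bases


def expected_errors(fidelity: float, n_bases: int) -> float:
    """Expected number of misincorporations over n_bases under the same assumption."""
    return (1.0 - fidelity) * n_bases


if __name__ == "__main__":
    fidelity, n = 0.99, 20  # the approximate lower bounds quoted in the text
    print(f"P(no error in {n} bases) = {error_free_probability(fidelity, n):.3f}")
    print(f"expected errors in {n} bases = {expected_errors(fidelity, n):.2f}")
```

Under this simple model, fidelity at the quoted lower bound would still leave a noticeable chance of at least one misincorporation per 20-base read, which is one reason reaction conditions may be chosen to reduce polymerization errors.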

[0143] Nucleotides can be selected to be compatible with the polymerizing agent to be used.
Procedures for selecting suitable nucleotide and polymerase combinations can be adapted from Ruth
et al. (1981) Molecular Pharmacology 20:415-422; Kutateladze, T., et al. (1984) Nuc. Acids Res.,
12:1671-1686; Chidgeavadze, Z., et al. (1985) FEBS Letters, 183:275-278. In certain embodiments,
the polymerizing agent is able to tolerate labeling moieties, quenching moieties, and/or chain
elongation inhibiting moieties on the nucleotide, including the base, sugar and/or phosphate groups.
For example, some applications of the present invention employ polymerizing agents that have
increased ability to incorporate modified, fluorophore-labeled nucleotides into a growing
complementary strand. Examples of such polymerizing agents have been described in U.S. Pat. No.
5,945,312, e.g., mutant bacteriophage T4 DNA polymerases, as well as mutant T2, T4, or T6 DNA
polymerase including, but not limited to, L412M-DNA polymerase, Q380K-DNA polymerase,
E395K-DNA polymerase, E743K-DNA polymerase, M725I-DNA polymerase, M725V-DNA
polymerase, S756P-DNA polymerase, L771F-DNA polymerase, L771H-DNA polymerase, [...]-DNA
polymerase, [...]-DNA polymerase, V355A-DNA polymerase, E395K+L412M-DNA polymerase,
L412M+E473K-DNA polymerase, E395K+L412M+E743K-DNA polymerase, and
Q380K+L412M+E743K-DNA polymerase.

[0144] In embodiments using a target polynucleotide-primer complex anchored on a surface of a
substrate, the polymerizing agent can be stored in a separate reservoir and flowed onto the substrates
(e.g., into a flow chamber/synthesis channel/cell which houses the substrate) prior to each extension
reaction cycle. The polymerizing agent also can be stored together with the other reaction reagents
(e.g., the nucleotide triphosphates). Alternatively, the polymerizing agent can be immobilized
on the surface of the substrate (e.g., the surface of the synthesis channel) along with the target
polynucleotide-primer complex, or while the target polynucleotide is added in solution.
B. Labeling Moieties 

[0145] In certain embodiments, to facilitate detection of nucleotide incorporation, at least one and
up to all types of the nucleotides (e.g., dATP, dTTP, dGTP, dCTP, and/or ATP, UTP, GTP, and CTP)
bear a labeling moiety. Various labeling moieties which are easily detected include radioactive labels,
optically-detectable labels, spectroscopic labels and the like. In certain embodiments, fluorescent
labeling moieties are used. When more than one type of nucleotide bears a labeling moiety, a
different kind of labeling moiety can be used to label each different type of nucleotide. However, in
some applications, the different types of nucleotides can be labeled with the same kind of labeling
moieties.

[0146] Various fluorescent labeling moieties can be used to label the nucleotides in the present
invention. The fluorescent labeling moiety can be selected from any of a number of different
moieties. In some embodiments a fluorescent group for which detection is quite sensitive is selected.
For example, fluorescein- or rhodamine-labeled nucleotides may be selected and are available
commercially.

[0147] Fluorescent moieties having a high quantum yield and a large extinction coefficient may
also be chosen to facilitate detection. Fluorescent moieties with a large Stokes shift (i.e., the
difference between the wavelength of maximum absorbance and the wavelength of maximum
emission) may also be selected so that the fluorescent emission is readily distinguished from the
excitation source used. Further, certain visible and near IR fluorescent moieties are sufficiently
fluorescent and photostable to be detected as single molecules. For example, single molecules of
BODIPY R6G (525/545), LI-COR, and IRD-38 (780/810) can be detected and can be used in the practice
of certain embodiments of the present invention. Fluorescent labels exhibiting particularly high
coefficients of destruction can also be useful in destroying nonspecific background signals.
[0148] The affinity for the surface can vary for different fluorescent dyes. For example, Cy3 and 
Cy5 are used to label the primer or nucleotides in some embodiments of the invention. However, Cy5 
has higher affinity to the surface under certain experimental conditions than Cy3, making Cy3 (the 
lower affinity dye) more suitable in certain embodiments. 

[0149] Another factor that may be considered is the stability of different fluorescent dyes. For
example, Cy5 is less stable and tends to bleach faster than Cy3. This can be an advantage or
[...] in [...] efficiency of incorporation of the nucleotides bearing them. In some embodiments, inefficient
incorporation due to choking, for example, is desirable. In some embodiments, inefficient incorporation
may also be desirable to lengthen the half life of incorporation reactions, facilitating short cycle
sequencing approaches. Further, the length of the linker between the labeling moiety and the
nucleotide can impact efficiency of the incorporation (see, Zhu and Waggoner, Cytometry 28: 206,
1997).

[0150] An exemplary list of fluorophores, with their corresponding absorption/emission
wavelengths indicated in parentheses, which can be used in the present invention includes Cy3
(550/565), Cy5 (650/664), Cy7 (750/770), Rho123 (507/529), R6G (528/551), BODIPY 576/589
(576/589), BODIPY TR (588/616), Nile Blue (627/660), BODIPY 650/665 (650/665), Sulfo-IRD700
(680/705), NN382 (778/806), Alexa488 (490/520), Tetramethylrhodamine (550/570), and Rhodamine
X (575/605). In instances where a multi-labeling scheme is utilized, a wavelength which
approximates the mean of the absorption maxima of the various labeling moieties may be used.
Alternatively, multiple excitations may be performed, each using a wavelength corresponding to the
absorption maximum of a specific labeling moiety.
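
The wavelength pairs listed above lend themselves to simple calculations. The Python sketch below tabulates them, computes each Stokes shift (emission maximum minus absorption maximum), and computes the mean absorption maximum for a chosen set of labels, as suggested for a single-excitation multi-labeling scheme. The wavelengths are those given in the list; the dictionary layout and function names are illustrative assumptions.

```python
# (absorption maximum, emission maximum) in nm, as listed in the text
FLUOROPHORES = {
    "Cy3": (550, 565),
    "Cy5": (650, 664),
    "Cy7": (750, 770),
    "Rho123": (507, 529),
    "R6G": (528, 551),
    "BODIPY 576/589": (576, 589),
    "BODIPY TR": (588, 616),
    "Nile Blue": (627, 660),
    "BODIPY 650/665": (650, 665),
    "Sulfo-IRD700": (680, 705),
    "NN382": (778, 806),
    "Alexa488": (490, 520),
    "Tetramethylrhodamine": (550, 570),
    "Rhodamine X": (575, 605),
}


def stokes_shift(name: str) -> int:
    """Emission maximum minus absorption maximum, in nm."""
    absorption, emission = FLUOROPHORES[name]
    return emission - absorption


def mean_excitation(names: list[str]) -> float:
    """Mean of the absorption maxima of the chosen labels, in nm."""
    if not names:
        raise ValueError("at least one label is required")
    return sum(FLUOROPHORES[n][0] for n in names) / len(names)


if __name__ == "__main__":
    for name in sorted(FLUOROPHORES, key=stokes_shift, reverse=True):
        print(f"{name:22s} Stokes shift {stokes_shift(name):3d} nm")
    print("mean excitation for Cy3 + Cy5:", mean_excitation(["Cy3", "Cy5"]), "nm")
```

A mean excitation wavelength can sit far from each dye's own maximum, as with Cy3 and Cy5 here, which may be a reason to prefer the alternative of separate excitations at each absorption maximum.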

[0151] Certain fluorescently-labeled nucleotides can be obtained commercially (e.g., from Perkin
Elmer, Amersham, or BDL). Alternatively, fluorescently-labeled nucleotides can also be produced by
various fluorescence-labeling techniques, e.g., as described in Kambara et al. (1988) "Optimization of
Parameters in a DNA Sequenator Using Fluorescence Detection," Bio/Technol. 6:816-821; Smith et
al. (1985) Nucl. Acids Res. 13:2399-2412; and Smith et al. (1986) Nature 321:674-679. Acyl fluoride
of Cy5 cyanine dye can also be synthesized and labeled as described in U.S. Pat. No. 6,342,326.
Other examples of nucleotides bearing fluorescent labeling moieties that may be used in certain
embodiments include dATP-lissamine, dCTP-Cy3, dATP-Tetramethylrhodamine, and dATP-Texas
Red.

[0152] There is a great deal of practical guidance available in the literature providing an exhaustive
list of fluorescent molecules and their relevant optical properties. See, for example, Berlman,
Handbook of Fluorescence Spectra of Aromatic Molecules, 2nd Edition (Academic Press, New York,
1971); Griffiths, Colour and Constitution of Organic Molecules (Academic Press, New York, 1976);
Bishop, Ed., Indicators (Pergamon Press, Oxford, 1972); Haugland, Handbook of Fluorescent Probes
and Research Chemicals (Molecular Probes, Eugene, 1992); Pringsheim, Fluorescence and
Phosphorescence (Interscience Publishers, New York, 1949); and the like. Further, there is extensive
guidance in the literature for derivatizing fluorophore molecules for covalent attachment via common
reactive groups that can be added to a nucleotide, as exemplified by the following references:
Haugland (supra); Ullman et al., U.S. Pat. No. 3,996,345; Khanna et al., U.S. Pat. No. 4,351,760.
[0153] Further, there are many linking moieties and methodologies for attaching fluorophore
moieties to nucleotides, as exemplified by the following references: Eckstein, editor, Oligonucleotides
and Analogues: A Practical Approach (IRL Press, Oxford, 1991); Zuckerman et al., Nucleic Acids
Research, 15: 5305-5321 (1987) (3' thiol group on oligonucleotide); Sharma et al., Nucleic Acids
Research, 19: 3019 (1991) (3' sulfhydryl); Giusti et al., PCR Methods and Applications, 2: 223-227
(1993) and Fung et al., U.S. Pat. No. 4,757,141 (5' phosphoamino group via Aminolink™ II available
from Applied Biosystems, Foster City, Calif.); Stabinsky, U.S. Pat. No. 4,739,044 (3'
aminoalkylphosphoryl group); Agrawal et al., Tetrahedron Letters, 31: 1543-1546 (1990) (attachment
via phosphoramidate linkages); Sproat et al., Nucleic Acids Research, 15: 4837 (1987) (5' mercapto
group); Nelson et al., Nucleic Acids Research, 17: 7187-7194 (1989) (3' amino group); and the like.
β- or γ-Quenching Moieties



[0154] Some embodiments of the present invention use labeling moieties that become detectable
upon incorporation of nucleotide into the complementary strand. In certain embodiments, for
example, the nucleotides used comprise a fluorescent moiety on any position, as well as a quenching
moiety on any or all of the phosphates of a nucleotide that are removed as the nucleotide incorporates
into a polynucleotide molecule. For example, the quenching moiety may be on the β-phosphate of a
[...] well as, it may be on the δ-phosphate of a nucleotide tetraphosphate and/or on the ε-phosphate of a
nucleotide pentaphosphate.

[0155] The quenching moiety hinders fluorescence of free nucleotides, due to the proximity of the
quenching and fluorescent moieties on a given nucleotide molecule. However, incorporation of a
nucleotide di-, tri-, tetra-, or pentaphosphate releases the non-α phosphates, whereupon the quenching
moiety is also released, separating the fluorescent-quenching pair. For example, incorporation of a
nucleotide triphosphate into a growing strand releases the β- and γ-phosphates (as pyrophosphate).
Consequently, upon incorporation and/or removal of the released phosphates, e.g., pyrophosphate,
fluorescence from the labeling moiety increases, allowing detection of incorporated nucleotide.
[0156] Any fluorescent-quenching pair can be used, where the fluorescent moiety attaches at any
position on the nucleotide base, sugar, and/or α-phosphate, the quenching moiety is sufficiently
proximal to the fluorescent moiety to inhibit its fluorescence, and the nucleotide bearing the
fluorescent and quenching moieties remains capable of base-complementary incorporation by a
polymerizing agent into a growing complementary strand. Nucleotide triphosphates having a
quenching moiety attached to the γ-phosphate are of interest, as substitutions at this position are
known to still allow recognition by polymerizing agents. See, e.g., Felicia et al., Arch. Biochem.
Biophys., 246: 564-571 (1986).

[0157] In certain embodiments, the fluorescent and/or quenching moiety are derivatized for
attachment to the nucleotide either directly or via a linker. There are many linking moieties and
methods for coupling fluorescent and quenching moieties to nucleotides, for example: Eckstein,
editor, Oligonucleotides and Analogues: A Practical Approach (IRL Press, Oxford, 1991); Zuckerman
et al., Nucleic Acids Research, 15: 5305-5321 (1987) (3' thiol group on oligonucleotide); Sharma et
al., Nucleic Acids Research, 19: 3019 (1991) (3' sulfhydryl); Giusti et al., PCR Methods and
Applications, 2: 223-227 (1993) and Fung et al., U.S. Pat. No. 4,757,141 (5' phosphoamino group via
Aminolink™ II available from Applied Biosystems, Foster City, Calif.); Stabinsky, U.S. Pat. No.
4,739,044 (3' aminoalkylphosphoryl group); Agrawal et al., Tetrahedron Letters, 31: 1543-1546
(1990) (attachment via phosphoramidate linkages); Sproat et al., Nucleic Acids Research, 15: 4837
(1987) (5' mercapto group); Nelson et al., Nucleic Acids Research, 17: 7187-7194 (1989) (3' amino
group); and the like.

[0158] Fluorescent-quenching pairs that can achieve intramolecular fluorescence quenching on a
nucleotide include, for example, 9,10-dioxa-syn-3,4,6,7-tetramethylbimane (bimane) and a halogen.
Quenching efficiencies of halogen substituents on bimane fluorescence have been shown to increase
in the order F < Cl < Br < I, in certain compounds. Sato et al., 1994. Other quenching moieties that
may be used include 4-(4'-dimethylaminophenylazo)benzoic acid (DABCYL), dinitrophenyl (DNP),
and trinitrophenyl (TNP).

[0159] As mentioned above, there is a great deal of practical guidance available in the literature
providing an exhaustive list of fluorescent molecules and their relevant optical properties. Further,
there is extensive guidance in the literature for derivatizing fluorescent and quenching moieties for
covalent attachment via common reactive groups to a nucleotide, as exemplified by the following
references: Haugland (supra); Ullman et al., U.S. Pat. No. 3,996,345; Khanna et al., U.S. Pat. No.
4,351,760. Many suitable forms of these compounds are also available commercially.

[0160] Using fluorescently-labeled nucleotides bearing a non-α-phosphate quencher helps reduce
background signals when detecting incorporated nucleotides. In such embodiments, detection of
incorporation depends on "turning on" a fluorescent signal by de-quenching a moiety as the
nucleotide becomes incorporated and the non-α-phosphates are released. Unincorporated nucleotides,
however, remain quenched, thereby reducing background signal.

[0161] Efficient quenching can further help reduce background fluorescence. That is, incomplete
quenching would result in low level background from each unincorporated molecule. In single
molecule detection, high quenching efficiency is advantageous as it helps reduce background,
enhancing the signal-to-noise ratio to permit detection of a single incorporated fluorescent moiety into
a single complementary strand.

[0162] In some embodiments, the fluorescent moiety on an unincorporated nucleotide exists
quenched with at least about a 2 fold, at least about a 3 fold, at least about a 4 fold, or at least about a
5 fold quenching efficiency compared to when the β- and/or γ-phosphates are detached from the
nucleotide. In some embodiments the quenching efficiency is at least about 10 fold, at least about 20
fold, at least about 30 fold, at least about 40 fold, at least about 50 fold, at least about 100 fold, at least
about 150 fold, at least about 200 fold, at least about 250 fold, at least about 300 fold, at least about
350 fold, at least about 400 fold, at least about 450 fold, at least about 500 fold, at least about 550 fold,
at least about 600 fold, at least about 650 fold, at least about 700 fold, at least about 750 fold, at least
about 800 fold, at least about 900 fold, at least about 950 fold, or at least about 1000 fold.
DABCYL, for example, quenches fluorescence from a wide variety of fluorescent moieties emitting
between about 475 nm and about 805 nm, and has shown efficiencies ranging from about 90 to about
99.9% (see S. Tyagi et al., Nat. Biotechnol. 16, 49 (1998); and G. T. Wang et al., Tetrahedron Lett.
31, 6493 (1990)).
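
The fold values and the percentage efficiencies quoted above can be related by a simple conversion. The following Python sketch assumes (the text does not say so explicitly) that "N fold" quenching means the ratio of unquenched to quenched intensity, so that a fractional efficiency E corresponds to 1/(1 - E) fold. The function names are illustrative only.

```python
def fold_from_efficiency(efficiency: float) -> float:
    """Convert a fractional quenching efficiency (0 <= E < 1) to fold quenching.

    Assumption: fold = unquenched intensity / quenched intensity = 1 / (1 - E).
    """
    if not 0.0 <= efficiency < 1.0:
        raise ValueError("efficiency must be in the interval [0, 1)")
    return 1.0 / (1.0 - efficiency)


def efficiency_from_fold(fold: float) -> float:
    """Inverse conversion: fold quenching (>= 1) to fractional efficiency."""
    if fold < 1.0:
        raise ValueError("fold must be at least 1")
    return 1.0 - 1.0 / fold


# Percent efficiencies spanning the range reported for DABCYL in the text.
for percent in (90.0, 99.0, 99.9):
    fold = fold_from_efficiency(percent / 100.0)
    print(f"{percent}% quenched corresponds to about {fold:.0f} fold")
```

Under this assumption, the 90 to 99.9% range reported for DABCYL would correspond to roughly 10 fold to 1000 fold quenching, which spans most of the fold values recited above.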
Reactive and Enzymatic Labeling Moieties

[0163] Certain embodiments use labeling moieties that only become detectable upon further
reaction, for example reaction with another moiety. Nucleotides bearing such a labeling moiety can
therefore remain undetectable by a given detection means until allowed to undergo reaction, for
example, after incorporation. This can help reduce background interference from unincorporated
nucleotides when detecting incorporated nucleotides, as the free nucleotides may be removed from the
polymerization complex before the reaction is allowed to proceed.

[0164] Embodiments utilizing reactive and/or enzymatic labeling moieties may be used with bulk
methods of sequencing, for example bulk single base extensions. In some embodiments, refinements
in the techniques may also allow for application to single molecule detection using reactive and/or
enzymatic labeling moieties.

[0165] For example, some embodiments use nucleotides comprising a reactive moiety that can
undergo a reaction, for example, following incorporation, to create a detectable product. In such
embodiments, detection of the product can identify incorporation of the nucleotide. Reactive moieties
include, for example, biotin as in biotin-dUTP, which can bind to streptavidin. Streptavidin in turn
may be conjugated to an enzymatic moiety. The enzymatic moiety-conjugated streptavidin can be
added to the biotin-labeled nucleotides after their incorporation into a growing complementary strand,
whereupon the enzymatic moiety may become bound to sites of incorporation. Addition of a
substrate for the enzymatic moiety followed by detection of the product produced can identify
incorporation.

[0166] Enzyme moieties can be selected that act on a substrate to produce a colored or otherwise
easily detectable product. Examples include horseradish peroxidase (HRP), which catalyzes an oxidation
reaction, changing a clear substrate to a colored product, as well as alkaline phosphatase,
galactosidase, luciferase, or acetylcholinesterase.

[0167] Various binding pairs also may be used, where one member of the pair attaches at any
position to a nucleotide base, sugar, and/or α-phosphate and the nucleotide remains capable of
base-complementary incorporation by a polymerizing agent into a growing complementary strand. As well
as biotin with streptavidin, biotin with avidin, digoxin with anti-digoxin, fluorescein with
anti-fluorescein antibody, and the like may be used. For example, biotin-dUTP, digoxin-dUTP and
fluorescein-dUTP are known in the art. As an illustration, these could be detected using horseradish
peroxidase-conjugated streptavidin, horseradish peroxidase-conjugated anti-digoxin, and alkaline
phosphatase-conjugated anti-fluorescein antibody, respectively.

[0168] In certain embodiments, nucleotides comprising an enzymatic moiety can be used, where
the enzymatic moiety catalyzes a reaction, for example, following incorporation, to create a detectable
product. Detection of the product can identify incorporation of the nucleotide. Again the enzyme
moiety can be selected for its ability to act on a substrate to produce a colored or otherwise easily
detectable product. Examples include horseradish peroxidase (HRP), alkaline phosphatase,
galactosidase, luciferase, or acetylcholinesterase. These and other enzymatic labeling moieties known
in the art may also be used, where the moiety can attach to any position of the nucleotide base, sugar,
or α-phosphate, and the nucleotide remains capable of base-complementary incorporation by a
polymerizing agent into a growing complementary strand.

Labeling a Fraction of the Nucleotides 

[0169] In certain embodiments where there are multiple copies of each template molecule
immobilized on the surface (e.g., on the surface of a synthesis channel), only a small percentage of
labeled nucleotides is sufficient for detection. For example, a radioactive label can be determined by
counting or any other method known in the art, while fluorescent labels can be induced to fluoresce,
e.g., by excitation. For fluorescently-labeled nucleotides, the percentage of labeled nucleotides can be
less than about 20%, less than about 10%, less than about 5%, less than about 1%, less than about
0.1%, less than about 0.01%, or less than about 0.001% of the total labeled and unlabeled nucleotides
for each type of the nucleotides.
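
To see why a small labeled fraction can suffice when many template copies are present, the following Python sketch estimates how many labels would be incorporated at a given position. It rests on two assumptions that are not taken from the text: the polymerizing agent incorporates labeled and unlabeled nucleotides with equal probability, and each template copy behaves independently. The copy number and the function names are hypothetical.

```python
def expected_labeled(copies: int, labeled_fraction: float) -> float:
    """Expected number of template copies receiving a labeled nucleotide.

    Assumes unbiased incorporation of labeled versus unlabeled nucleotides.
    """
    return copies * labeled_fraction


def prob_at_least_one(copies: int, labeled_fraction: float) -> float:
    """Probability that at least one of the copies carries a label."""
    return 1.0 - (1.0 - labeled_fraction) ** copies


# Hypothetical spot holding 100,000 template copies, with 0.1% labeled nucleotides.
copies = 100_000
fraction = 0.001
print("expected labeled copies:", expected_labeled(copies, fraction))
print("P(at least one label):", prob_at_least_one(copies, fraction))
```

With these illustrative numbers the expected count is on the order of one hundred labeled copies per position, even though 99.9% of the nucleotides of that type are unlabeled.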



[0170] In certain embodiments, a certain degree of stalling or slowing down of incorporation is
desired, e.g., in methods for "choking" the polymerizing agent and/or in methods utilizing short cycle
sequencing. In bulk embodiments of such methods, the percentage of labeled nucleotides may be
varied to obtain a desired degree of choking and/or of slowing down to prevent or hinder
incorporation accordingly.
C. Blocking Moieties 

[0171] In some embodiments, it may be desirable to employ blocking moieties in the primer
extension reaction (see, e.g., Dower et al., U.S. Pat. No. 5,902,723), to form chain elongation
inhibitors. Chain elongation inhibitors are nucleotide analogues which carry either chain terminating
moieties or chain elongation inhibiting moieties, which prevent or hinder further addition by the
polymerizing agent of nucleotides to the 3' end of the chain by becoming incorporated into the chain
themselves, or are choking moieties that inhibit further chain elongation by steric hindrance. In some
embodiments, the chain elongation inhibitors are dideoxynucleotides. Where the chain elongation
inhibitors are incorporated into the growing polynucleotide chain, they can be removed after
incorporation of the nucleotide has been detected, in order to allow the polymerization reaction to
proceed using further nucleotides. Some 3' to 5' exonucleases, e.g., exonuclease III, are able to
remove dideoxynucleotides.

[0172] Other than dideoxynucleotides, a blocking moiety can be employed on the 3' moiety of the
deoxyribose group of a nucleotide to prevent or inhibit further incorporation. In certain embodiments,
the blocking moiety can be removable under mild conditions (e.g., using photosensitive, weak acid
labile, or weak base labile groups), thereby allowing for further elongation of the primer strand in the
next synthetic cycle. If the blocking moiety also contains a labeling moiety, the dual blocking and 
labeling functions can be achieved without the need for separate reactions for the separate moieties. 
For example, a nucleotide can be labeled by attachment of a fluorescent dye group to the 3' moiety of 
[0178] Choking moieties may attach to any position on the nucleotide base, sugar, or phosphate,
where the nucleotide remains capable of base-complementary incorporation by a polymerizing agent
into a growing polynucleotide strand. In certain embodiments, the label is attached to the base, where
it better distorts the double helix of the synthesized molecule, thereby inhibiting further polymerizing
activity. For example, Krider et al., "2' modified nucleosides for site-specific labeling of
oligonucleotides," Bioconjug. Chem. Jan-Feb 13(1):155-62 (2002), describes the synthesis of 2'
modified nucleosides designed specifically for incorporating labels into oligonucleotides. These
methods can be used to attach sufficiently large labeling moieties to the 2' site to cause choking.
Similar methods can be used to attach labeling moieties to the 1' base position, the 2' base position,
the 4' base position, the 5' base position, the sugar moiety, the alpha phosphate, the beta phosphate, or
the gamma phosphate.

[0179] Crystal structures of several DNA polymerases have been described. See, e.g., Doublie et
al., "Crystal Structure of a Bacteriophage T7 DNA Replication Complex at 2.2 Angstrom
Resolution," Nature 391:251-258 (1998); Ollis, D.L., Brick, P., Hamlin, R., Xuong, N.G. & Steitz,
T.A., "Structure of large fragments of Escherichia coli DNA polymerase I complexed with dTMP,"
Nature 313, 762-766 (1985) (crystal structure of the Klenow fragment of E. coli Pol I); Beese, L.S.,
Derbyshire, V. & Steitz, T.A., "Structure of DNA Polymerase I Klenow fragment bound to duplex
DNA," Science 260, 352-355 (1993) (crystal structure of the Klenow fragment of E. coli Pol I);
Korolev, S., Nayal, M., Barnes, W.M., Di Cera, E. & Waksman, G., "Crystal structure of the large
fragment of Thermus aquaticus DNA polymerase I at 2.5-Å resolution: structural basis for
thermostability," Proc. Natl. Acad. Sci. USA 92, 9264-9268 (1995) (crystal structure of the analogous
fragments of Thermus aquaticus DNA polymerase); Kiefer, J.R. et al., "Crystal structure of a
thermostable Bacillus DNA polymerase I large fragment at 2.1 Å resolution," Structure 5, 95-108
(1997); and Kim, Y. et al., "Crystal structure of Thermus aquaticus DNA polymerase," Nature 376,
612-616 (1995). The dimensions of these structures can indicate what size labeling moieties cause a
given polymerizing agent to choke. Preferably, the labels are as bulky as Cy5, with molecular
weights at least about 1.5 kDa. More preferably, the labels are bulkier than Cy5, having molecular
weights of at least about 1.6 kDa, at least about 1.7 kDa, at least about 1.8 kDa, at least about 1.9 kDa,
at least about 2.0 kDa, at least about 2.5 kDa, or even at least about 3.0 kDa.
[0180] Further examples of such larger dyes include the following, with corresponding formula
weights (in g/mol) in parentheses: Cy5 (534.6); Pyrene (535.6); 6-Carboxyfluorescein (FAM) (537.5);
6-Carboxyfluorescein-DMT (FAM-X) (537.5); 5(6)-Carboxyfluorescein (FAM) (537.5);
5-Fluorescein (FITC) (537.6); Cy3B (543.0); WellRED D4-PA (544.8); BODIPY 630/650 (545.5); 3'
6-Carboxyfluorescein (FAM) (569.5); Cy3.5 (576.7); Cascade Blue (580.0); Alexa Fluor 430 (586.8);
Lucifer Yellow (605.5); Alexa Fluor 532 (608.8); WellRED D2-PA (611.0); Cy5.5 (634.8); DY-630
(634.8); DY-555 (636.2); WellRED D3-PA (645.0); Rhodamine Red-X (654.0); DY-730 (660.9);
DY-782 (660.9); DY-550 (667.8); DY-610 (667.8); DY-700 (668.9); 6-Tetrachlorofluorescein (TET)
(675.2); Alexa Fluor 568 (676.8); DY-650 (686.9); 5(6)-Carboxyeosin (689.0); Texas Red-X (702.0);
Alexa Fluor 594 (704.9); DY-675 (706.9); DY-750 (713.0); DY-681 (736.9); Hexachlorofluorescein
(HEX) (744.1); DY-633 (751.9); LightCycler Red 705 (753.0); LightCycler Red 640 (758.0); DY-636
(760.9); DY-701 (770.9); FAR-Fuchsia (5'-Amidite) (776.0); FAR-Fuchsia (SE) (776.0); DY-676
(808.0); Erythrosin (814); FAR-Blue (5'-Amidite) (824.0); FAR-Blue (SE) (824.0); Oyster 556
(850.0); Oyster 656 (900.0); FAR-Green Two (SE) (960.0); Alexa Fluor 546 (964.4); FAR-Green One
(SE) (976.0); Alexa Fluor 660 (985.0); Oyster 645 (1000.0); Alexa Fluor 680 (1035.0); Alexa Fluor
633 (1085.0); Alexa Fluor 555 (1135.0); Alexa Fluor 647 (1185.0); Alexa Fluor 750 (1185.0); Alexa
Fluor 700 (1285.0). These reagents are commercially available from SYNTHEGEN, LLC (10590
Westoffice Drive, Suite 200, Houston, Texas 77042), for example, or can be synthesized by
appropriate methods.

Sanger-like Sequencing using Choking Moieties 

[0181] While many of the innovations of the present invention relate to single molecule 
sequencing, certain advances herein described also facilitate alternate or improved methods of
carrying out classical bulk sequencing.

[0182] For example, another aspect of the present invention relates to an alternate method of doing
classic Sanger sequencing (a method of bulk sequencing). This aspect involves a method for
sequencing a target polynucleotide without using ddNTPs, by providing four reaction mixtures, each
comprising a primed target polynucleotide; a polymerizing agent; and four nucleotides, wherein a
proportion of one of the four nucleotides in each mixture comprises a moiety that inhibits further
chain elongation by steric hindrance; allowing incorporation of the nucleotides into a complementary
strand until a nucleotide preventing further chain elongation becomes incorporated; allowing
repetition of the above step to obtain a plurality of complementary strands of varying lengths; and
size-sorting the plurality of strands to analyze the sequence of the target polynucleotide.

[0183] The proportion of nucleotide types bearing a chain elongation inhibiting moiety can be
selected to allow limited chain termination. Thus in a plurality of growing complementary strands, a
nucleotide comprising an inhibiting moiety will become incorporated at different positions along the
sequence where that particular nucleotide type appears. This results in a plurality of complementary
strands of varying lengths, terminating at the positions where an inhibiting moiety became
incorporated. Preferably, this produces a ladder of strands, each one nucleotide longer than the other.
Sanger et al. PNAS 74: 5463 (1977).
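
The way such a ladder yields a sequence can be illustrated with a short Python sketch. It models an idealized case only: every occurrence of a nucleotide type is assumed to terminate at least some strands in the corresponding reaction mixture, and strand lengths are assumed to be resolved exactly. The example strand and the function names are hypothetical and are not taken from the specification.

```python
from typing import Dict, List


def termination_lengths(strand: str, base: str) -> List[int]:
    """Lengths of strands ending where `base` carried an inhibiting moiety.

    In the idealized case, the mixture for `base` yields one strand length
    for each position at which that base occurs in the complementary strand.
    """
    return [i + 1 for i, b in enumerate(strand) if b == base]


def read_ladder(ladders: Dict[str, List[int]]) -> str:
    """Size-sort all terminated strands and read off the terminating base."""
    base_at_length = {
        length: base for base, lengths in ladders.items() for length in lengths
    }
    return "".join(base_at_length[n] for n in sorted(base_at_length))


# Hypothetical complementary strand synthesized from the primed template.
complementary = "ATGCCGTA"

# One reaction mixture per nucleotide type, as in the four-mixture method.
ladders = {base: termination_lengths(complementary, base) for base in "ACGT"}

print(ladders)
print(read_ladder(ladders))
```

Reading the size-sorted strands from shortest to longest and noting which mixture each came from recovers the complementary strand one nucleotide at a time.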

[0184] In some embodiments, a nucleotide bears a labeling moiety, which becomes incorporated
into the growing complementary strands. In other embodiments, a primer bearing a labeling moiety is
used to prime the target polynucleotide. The labeling moiety may involve any of the detection
approaches described herein, or any other suitable labeling technique known in the art. The labeling
moiety facilitates detection of the complementary strands, for example during size-sorting.



WO 2005/080605 PCT/US2005/004258 


[0185] The complementary strands of varying lengths can be size-sorted by any methods
known in the art to resolve different length strands, including various electrophoresis techniques,
such as polyacrylamide gel electrophoresis (PAGE), ultra-thin slab gel electrophoresis, capillary
array electrophoresis, and automatic gel readers. To facilitate determination of the nucleotide type
terminating each strand, a detectably distinguishable labeling moiety may be used in each of the four
reaction mixtures, or the strands from the different reaction mixtures may be sorted in different gels or
resolved using mass spectroscopy. For a review of some of these methods, see Chen, "High-Speed
DNA-Sequence Analysis," Prog. Biochem. Biophys. 22: 223-227 (1995).
D. Removal of Labeling and Blocking Moieties 

[0186] By carrying out the incorporation and detection steps, one or more nucleotides on the target
polynucleotide adjacent to the 3' end of the primer can be identified. Once this has been achieved,
the labeling moiety may be removed before repeating the cycle to discover the identity of the next
nucleotide or nucleotides. Removal of the labeling moiety can be effected by removal of the labeled
nucleotide itself, using a 3'-5' exonuclease, for example, and subsequent replacement with an
unlabeled nucleotide.

[0187] Alternatively, the labeling moiety can be removed from the nucleotide. Release of a
fluorescent dye, for example, can be achieved if a detachable connection between the nucleotide and
the fluorescent molecule is used. For example, the use of disulfide bonds enables one to disconnect
the dye by applying a reducing agent like dithiothreitol (DTT). The connection may also be detached
by other chemical means, as well as by enzymatic and/or photochemical means.
[0188] In a further alternative, where the labeling moiety is a fluorescent moiety, it is possible to
neutralize the fluorescence by bleaching it with radiation. Photobleaching can be performed
according to methods, e.g., as described in Jacobson et al., "International Workshop on the
Application of Fluorescence Photobleaching Techniques to Problems in Cell Biology," Federation
Proceedings, 42:72-79, 1973; Okabe et al., J Cell Biol 120:1177-86, 1993; Wedekind et al., J Microsc.
176 (Pt 1): 23-33, 1994; and Close et al., Radiat Res 53:349-57, 1973.

[0189] If choking and/or other blocking moieties have been used, these can be removed before the
next cycle takes place. 3' blocking moieties can be removed by photochemical, chemical or enzymatic
cleavage of the blocking group from the nucleotide. For example, chain terminating moieties are
removed with a 3'-5' exonuclease, e.g., exonuclease III. Once the labeling and blocking moieties have
been removed, the cycle can be repeated to discover the identity of the next nucleotide or nucleotides.
[0190] Similarly, if a labeling moiety sufficiently large to cause choking is used, the moiety, or the
sterically hindering portion of the moiety, can be removed to allow chain elongation to resume. If the
labeling moiety only causes choking after a small number of labeled nucleotides are incorporated, the
moiety or a portion thereof may be removed only after every few incorporations. Choking labeling
moieties, or portions thereof, may be removed as described above, i.e., by enzymatic,
chemical, or photochemical means.

[0191] Removal of the blocking moieties may be unnecessary if only a percentage of the
nucleotides carry blocking moieties, e.g., in certain bulk applications. In this approach, the chains
incorporating the blocked nucleotides are permanently terminated and no longer participate in the
elongation processes. In such embodiments, a small percentage of permanent loss in each cycle can
be tolerated.
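
How much permanent loss can be tolerated depends on how many cycles are run. The following Python sketch assumes, purely for illustration, that a fixed fraction of the still-active chains is terminated in every cycle and that cycles are independent; the loss rate, cycle count and function names are hypothetical.

```python
import math


def active_fraction(loss_per_cycle: float, cycles: int) -> float:
    """Fraction of chains still elongating after a number of cycles.

    Assumes each cycle permanently terminates the same fraction of the
    chains that were still active at the start of that cycle.
    """
    return (1.0 - loss_per_cycle) ** cycles


def max_cycles(loss_per_cycle: float, min_active_fraction: float) -> int:
    """Largest cycle count that keeps at least `min_active_fraction` active."""
    return math.floor(math.log(min_active_fraction) / math.log(1.0 - loss_per_cycle))


# Hypothetical example: 1% of active chains are lost in each cycle.
print(active_fraction(0.01, 50))
print(max_cycles(0.01, 0.5))
```

With a 1% loss per cycle, roughly 60% of the chains would still be active after 50 cycles under these assumptions, which suggests why a small per-cycle loss is acceptable in bulk applications.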

[0192] In some embodiments, nucleotide incorporation is monitored by detection of pyrophosphate
release (see, e.g., WO98/13523, WO98/28440, and Ronaghi et al., Science 281:363, 1998).
Pyrophosphate is released upon incorporation of a deoxynucleotide or dideoxynucleotide, which can
be detected enzymatically. For example, a pyrophosphate-detection enzyme cascade can be included
in the reaction mixture in order to produce a chemiluminescent signal. In some embodiments, this
method employs no wash steps, instead relying on continual addition of reagents. Also, instead of or
as well as deoxynucleotides or dideoxynucleotides, one or more nucleotide analogues can be used
which are capable of acting as substrates for the polymerizing agent but incapable of acting as
substrates for the pyrophosphate-detection enzyme.
Removal of Non-cleavable Labeling Moieties

[0193] Certain embodiments of the invention provide a plurality (two or more) of nucleotide types,
where a nucleotide bears both a non-cleavable labeling moiety and a blocking moiety. While most
other groups have focused on cleavable labels, this approach uses bleaching instead. That is, signal
from incorporated labeling moiety may be neutralized or reduced after one or more incorporations
into the complementary strand by bleaching, such as photo-bleaching or chemical bleaching, rather
than by cleavage. As mentioned above, photobleaching can be performed according to methods, e.g.,
as described in Jacobson et al., "International Workshop on the Application of Fluorescence
Photobleaching Techniques to Problems in Cell Biology," Federation Proceedings, 42:72-79, 1973;
Okabe et al., J Cell Biol 120:1177-86, 1993; Wedekind et al., J Microsc. 176 (Pt 1): 23-33, 1994; and
Close et al., Radiat Res 53:349-57, 1973.

[0194] "Non-cleavable" is used herein to indicate a chemical linkage that is particularly resistant to 
cleavage under the conditions used in the polymerization reactions and detection procedures, as well 
as any other reactions short of very harsh or unique conditions. That is, the connection between the
labeling moiety and the nucleotide remains intact under the physical, chemical, and/or enzymatic 
conditions of the incorporation and detection steps, as well as any bleaching step used to reduce its 
signal. 

[0195] The labeling moiety may attach directly or indirectly to one or more positions on the
nucleotide base, sugar, or α-phosphate, so long as it is stable and allows substrate recognition by the
polymerizing agent. Three-dimensional structures of the polymerization site reveal sufficient space
surrounding the area of the 5-position of a pyrimidine to allow for modification. For example, energy
transfer dyes at the 5-position of the pyrimidines (T and C) allow recognition and incorporation, as do
dyes at the 7-position of the purines (G and A) (Rosenblum et al. 1997, Zhu et al. 1994).
[0196] Non-cleavable linkages may include covalent or other types of bonds that require particular
conditions for cleavage. Methoxy linkages, for example, require stringent anhydrous conditions,
making it difficult to chemically cleave these linkages. Similarly, -O-ethoxy-nucleotides have been
reported as good substrates for several polymerizing agents (Axelrod et al. 1978), thus providing
another non-cleavable linkage for use in certain embodiments of the invention.
[0197] Other examples of non-cleavable labeling moieties include fluorescein phosphoramidites
(FAM), digoxigenin-nucleotides, and mercurated nucleotide analogs. FAM dyes may be coupled to
nucleotides, e.g., at a hydroxyl group. Theisen et al. 1992. Such dyes have been used in automated
DNA synthesizers, where the dye and its linkage to an oligonucleotide have proven stable under
polymerization and cleavage/deprotection conditions. Similarly, digoxigenin-11-dUTP can be
incorporated in a growing polynucleotide strand and remains intact even under conditions of the
polymerase chain reaction. Taveira et al. 1992. Further, mercury atom-bearing pyrimidine
nucleotides have been shown to be heat- and thiol-stable and can be specifically incorporated into a
growing complementary strand by polymerizing agents. Bridgman et al. 1996.
Single Step Bleaching & Cleaving 

[0198] Certain embodiments of the present invention can reduce the number of steps needed for
analyzing sequences by synthesis. For example, certain embodiments achieve reduction of
incorporated signals along with reversal of chain termination in a single step, even where the labeling
moiety and the blocking moiety are separate moieties. Such embodiments utilize nucleotides where a
nucleotide bears a labeling moiety and a blocking moiety on different positions on the nucleotide. By
using a blocking moiety having a chemically-cleavable group, however, it is possible to chemically
cleave the blocking moiety, thus reversing chain termination, while chemically bleaching incorporated
signal in a single step. Similarly, by using a blocking moiety having a photo-cleavable group, it is
possible to photo-cleave the blocking moiety, thus reversing chain termination, while photo-bleaching
incorporated signal in a single step.

[0199] The bleaching plus cleaving step may be performed after about one, about
two, about three, about four, about five, about six, about seven, about eight, about nine, or about ten
incorporations. Using such an approach, bleaching and resumption of chain elongation may occur in
a single step, even where the labeling moiety is a separate moiety from the blocking moiety and even
where the labeling moiety attaches to the nucleotide via a non-cleavable linkage.
[0200] In certain embodiments using chemical cleaving and bleaching, the blocking moiety is
coupled to any position of the nucleotide by any linkage susceptible to cleavage by chemical means
that also serves to chemically bleach incorporated labeling moiety. Attachment of the blocking
moiety can be made to the base, sugar, or α-phosphate positions of the nucleotide, for example, with
or without a linker, where the nucleotide remains capable of base-complementary incorporation by a
polymerizing agent into a growing complementary strand.

[0201] As noted above, disulfide linkage between a moiety and a nucleotide, for example, permits
chemical cleavage using dithiothreitol (DTT). Thiol-modified nucleotides have also proved useful for
cleavably-attaching a variety of moieties to nucleotides. Hanna et al., "Synthesis and Characterization
[...] T7-RNA Polymerases," Nucleic Acids Res. 21:2073-2079 (1993). Any other suitable means of
chemical cleavage may be used that does not damage the polymerization complex or the linkages of
the polynucleotides.

[0202] In certain embodiments using photo cleaving and bleaching, the blocking moiety is coupled 
to any position of the nucleotide by any linkage susceptible to photo-cleavage, where the photo- 
radiation used also serves to photo-bleach the incorporated labeling moiety and the nucleotide remains 
capable of base-complementary incorporation by a polymerizing agent into a growing complementary 
strand. Typically, a wavelength equal to the wavelength of light absorbed by the fluorescent moiety 
can be used to photobleach it.

[0203] Attachment of the blocking moiety can be made to the base, sugar, or α-phosphate positions
of the nucleotide, with or without a linker. Photocleavable linkers, such as linkers comprising a
2-nitrobenzyl moiety, have been demonstrated. Hasan et al. 1997; Li et al. 2003. Such linkers are stable
under polymerization conditions, but are cleaved when subjected to UV irradiation at about 340 nm.
Similarly, 9-phenylthioxanthyl, 9-(2-naphthyl)thioxanthenol, and
9-(2-(6-methoxy)naphthyl)thioxanthenol have been developed as photocleavable protecting groups for
hydroxyl functionality of nucleosides.

[0204] Radiation used to photocleave the blocking moiety can also bleach signals from fluorescent
moieties, for example, using light, ultraviolet light and/or laser radiation of a wavelength absorbed by
a fluorescent moiety. In some embodiments, UV irradiation of 340 nm, which cleaves 2-nitrobenzyl
linkers, may also bleach fluorescent labels with similar absorption maxima (e.g.,
4-(4-methoxybenzylamino)-7-nitrobenzofurazan, 5-dimethylaminonaphthalene-1-sulfonyl chloride,
dansyl cadaverine, and N-(iodoacetaminoethyl)-1-naphthylamine-5-sulfonic acid). See
http:www.signiaaldMch.com 
Fluoresent Probes/Labels Jitml. 



[0205] Furthermore, for FRET embodiments, bleaching radiation may be selected to extinguish
signal from the acceptor fluorophore but not the donor fluorophore, facilitating repeated use of the
same donor moiety with different acceptor labeling moieties as they become incorporated into a
growing complementary strand during sequencing analysis. For example, where Cy3 is used as the
donor moiety and Cy5 is used as the acceptor moiety, a red laser of about 635 nm can be used to
bleach the Cy5 acceptor, leaving the Cy3 donor unharmed. Quake et al., "Sequencing information can
be obtained from single DNA molecules," PNAS 100(7):3960-3964 (2003).




[0206] The photobleaching radiation can be applied as a light pulse for a certain period of time to
destroy or reduce incorporated signal. The light pulse is typically applied for about 50 seconds or
less, about 30 seconds or less, about 20 seconds or less, about 15 seconds or less, about 10 seconds or
less, about 5 seconds or less, about 3 seconds or less, about 1 second or less, about 0.5 seconds or less,
about 0.2 seconds or less, or about 0.1 second or less.
E. Reaction Conditions

[0207] The reaction mixture for the polymerizing reactions may comprise an aqueous buffer
medium, which may be optimized for the particular polymerizing agent. In general, the buffer can
include a source of monovalent ions, a source of divalent cations and a buffering agent. Any
convenient source of monovalent ions, such as KCl, K-acetate, NH4-acetate, K-glutamate, NH4Cl,
ammonium sulfate, and the like may be employed, where the monovalent ion source
will typically be present in the buffer in an amount sufficient to provide a conductivity in a
range from about 500 to about 20,000, usually from about 1000 to about 10,000, and more usually
from about 3,000 to about 6,000 micromhos.

[0208] The divalent cation may be magnesium, manganese, zinc and the like. Any convenient
source of magnesium cation may be employed, including MgCl2, Mg-acetate, and the like. The
amount of Mg ion present in the buffer may range from about 0.5 to about 20 mM, from about 1 to
about 12 mM, from about 2 to about 10 mM, or about 5 mM.

[0209] Representative buffering agents or salts that may be present in the buffer include Tris,
Tricine, HEPES, MOPS and the like, where the amount of buffering agent will typically range from
about 5 to about 150 mM, from about 10 to about 100 mM, or from about 20 to about 50 mM. In
certain embodiments, the buffering agent will be present in an amount sufficient to provide a pH
ranging from about 6.0 to about 9.5, including a pH about 7.6 at about 25° C. Other agents which
may be present in the buffer medium include chelating agents, such as EDTA, EGTA and the like.
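
The ranges given in paragraphs [0207] to [0209] can be collected into a simple check of a proposed buffer composition. The Python sketch below uses the broadest range quoted for each parameter and treats the "about" bounds as nominal limits; the `Buffer` class, the field names and the example recipe are hypothetical conveniences, not part of the specification.

```python
from dataclasses import dataclass
from typing import List

# Broadest ranges quoted in the text; the unit is part of each key name.
RANGES = {
    "conductivity_micromhos": (500.0, 20000.0),
    "mg_mM": (0.5, 20.0),
    "buffer_mM": (5.0, 150.0),
    "pH": (6.0, 9.5),
}


@dataclass
class Buffer:
    conductivity_micromhos: float  # set by the monovalent ion source
    mg_mM: float                   # magnesium ion concentration
    buffer_mM: float               # buffering agent, e.g. Tris or HEPES
    pH: float                      # measured at about 25 degrees C


def out_of_range(buf: Buffer) -> List[str]:
    """Return the names of parameters falling outside the quoted ranges."""
    return [
        name
        for name, (low, high) in RANGES.items()
        if not low <= getattr(buf, name) <= high
    ]


# Hypothetical recipe using mid-range values mentioned in the text.
recipe = Buffer(conductivity_micromhos=4000.0, mg_mM=5.0, buffer_mM=20.0, pH=7.6)
print(out_of_range(recipe))
```

A recipe that returns an empty list lies within the broadest ranges stated above; the narrower "usually" ranges could be substituted in `RANGES` for a stricter check.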

G. Sample Housing

[0210] The substrate can be housed in a flow chamber having an inlet and outlet to allow for
renewal of reactants which flow past the immobilized moieties. The flow chamber can be made of
plastic, glass, membrane material or gel, and can either be open or transparent in the plane viewed by
the microscope or optical reader. Electro-osmotic flow can be achieved by a fixed charge on the
substrate and a voltage gradient (current) passing between two electrodes placed at opposing ends of
the support. Pressure driven flow can be facilitated by a microfluidic device with an external pressure
source or by a microfluidic peristaltic pump (see, e.g., Unger et al., Science 288:113-116, 2000).
[0211] The flow chamber can be divided into multiple channels for separate polymerization
reactions. Examples of micro flow chambers are described in Fu et al., Nat. Biotechnol. (1999)
17:1109, which describes a micro-fabricated fluorescence-activated cell sorter with 3 µm × 4 µm
channels that utilizes electro-osmotic flow for sorting. In certain embodiments, the flow chamber can
contain micro-fabricated synthesis channels as described in WO01/32930. The polynucleotide
templates or primers can be immobilized to the surface of the synthesis channels. These synthesis
channels can be in fluid communication with a microfluidic device, which controls flow of reaction
reagents. As an example, microfluidic devices that can be employed to control flow of reaction
reagents in the present invention have been described in WO01/32930.
[0212] The present invention also provides apparatuses for carrying out the methods of the
invention. Other than the substrate to which the target polynucleotides or primers are attached, the
apparatuses usually comprise a flow chamber in which the substrate is housed. In addition, the
apparatuses can optionally contain plumbing devices (e.g., an inlet and an outlet port), a light source,
and a detection system described herein. For example, a microfabricated apparatus as described in
WO01/32930 can be adapted to house the substrate of the present invention, as described below:
1. Preferred embodiments of the Apparatuses 
a, Basic Features of the Apparatuses 

[0213] Certain embodiments of the flow chambers of the present invention can comprise
micro-fabricated channels to which polynucleotide templates or primers are attached. Optionally, the
apparatuses comprise plumbing components (e.g., pumps, valves, and connecting channels) for
flowing reaction reagents. The apparatuses can also comprise an array of reservoirs for storing
reaction reagents (e.g., the polymerizing agent, each type of nucleotide, and other reagents can each
be stored in a different reservoir). 

[0214] The micro-fabricated components of the apparatuses can all have a basic "flow channel"
structure. The term "flow channel" or "micro-fabricated flow channel" refers to a recess in a
structure, which can contain a flow of fluid or gas. The polynucleotides can be attached to the interior
surface of micro-fabricated channels in which synthesis occurs. For consistency and clarity, the flow
channels are termed "synthesis channels" when referring to such specific use. The micro-fabricated
flow channels can also be actuated to function as plumbing components (e.g., micro-pumps,
micro-valves, or connecting channels) of the apparatuses.

[0215] In some applications, micro-fabricated flow channels are cast on a chip (e.g., an elastomeric
chip). Synthesis channels are formed by bonding the chip to a flat substrate (e.g., a glass cover slip),
which seals the channel. Thus, one side of the synthesis channel is provided by the flat substrate.
Typically, the polynucleotide templates or primers are attached to the interior surface of the substrate
within the synthesis channel.

[0216] The plumbing components can be micro-fabricated as described in the present invention.
For example, the apparatuses can contain, in an integrated system, a flow cell in which a plurality of
synthesis channels and fluidic components (such as micro-pumps, micro-valves, and connecting
channels) for controlling the flow of the reagents into and out of the flow cell are present.
Alternatively, the sequencing apparatuses of the present invention utilize plumbing devices described
in, e.g., Zdeblick et al., "A Microminiature Electric-to-Fluidic Valve," Proceedings of the 4th
International Conference on Solid State Transducers and Actuators, 1987; Shoji et al., "Smallest Dead
Volume Microvalves for Integrated Chemical Analyzing Systems," Proceedings of Transducers '91,
San Francisco, 1991; and Vieider et al., "A Pneumatically Actuated Micro Valve with a Silicon Rubber
Membrane for Integration with Fluid Handling Systems," Proceedings of Transducers '95, Stockholm,
1995.

[0217] As noted above, at least some of the components of the apparatuses are micro-fabricated.
Micro-fabrication refers to feature dimensions on the micron level, with at least one dimension of the
micro-fabricated structure being less than about 1000 µm. In some apparatuses, only the synthesis
channels are micro-fabricated. In some apparatuses, in addition to the synthesis channels, the valves,
pumps, and connecting channels are also micro-fabricated. Unless otherwise specified, the discussion
below of micro-fabrication is applicable to production of all micro-fabricated components of the
apparatuses (e.g., the synthesis channels in which polymerization reactions occur, and the valves,
pumps, and connecting channels for controlling reagent flow to the synthesis channels). Employment
of micro-fabricated synthesis channels and/or micro-fabricated plumbing components significantly
reduces the dead volume and decreases the amount of time needed to exchange reagents, which in turn
increases throughput.

[0218] In general, the micro-fabricated structures (e.g., synthesis channels, pumps, valves, and 
connecting channels) have widths of about 0.01 to about 1000 microns, and a width-to-depth ratio of
between about 0.1:1 and about 100:1. Preferably, the width is in the range of about 10 to about 200
microns, with a width-to-depth ratio of about 3:1 to about 15:1.

b. Non-elastomer Based Apparatuses

[0219] As discussed above, while elastomers are preferred materials for fabricating the apparatuses 
of the present invention, non-elastomer based microfluidic devices can also be used in the apparatuses
of the present invention. In some applications, the apparatuses utilize microfluidics based on
conventional micro-electromechanical system (MEMS) technology. Methods of producing
conventional MEMS microfluidic systems such as bulk micro-machining and surface
micro-machining have been described, e.g., in Terry et al., A Gas Chromatographic Air Analyzer Fabricated
on a Silicon Wafer, IEEE Trans. on Electron Devices, v. ED-26, pp. 1880-1886, 1979; and Berg et al.,
Micro Total Analysis Systems, New York, Kluwer, 1994. 

[0220] Bulk micro-machining is a subtractive fabrication method whereby single crystal silicon is 
lithographically patterned and then etched to form three-dimensional structures. For example, bulk
micromachining technology, which includes the use of glass wafer processing, or silicon-to-glass 
wafer bonding, has been commonly used to fabricate individual microfluidic components. This glass- 
bonding technology has also been used to fabricate microfluidic systems. 
[0221] Surface micro-machining is an additive method where layers of semiconductor-type 
materials such as polysilicon, silicon nitride, silicon dioxide, and various metals are sequentially
added and patterned to make three-dimensional structures. Surface micromachining technology can be
used to fabricate individual fluidic components as well as microfluidic systems with on-chip
electronics. In addition, unlike bonded-type devices, hermetic channels can be built in a relatively
simple manner using channel walls made of polysilicon (see, e.g., Webster et al., Monolithic
Capillary Gel Electrophoresis Stage with On-Chip Detector, in International Conference on Micro
Electromechanical Systems, MEMS 96, pp. 491-496, 1996), silicon nitride (see, e.g., Mastrangelo et
al., Vacuum-Sealed Silicon Micromachined Incandescent Light Source, in Intl. Electron Devices
Meeting, IEDM 89, pp. 503-506, 1989), silicon dioxide and the like.

[0222] In some applications, electrokinetic flow based microfluidics can be employed in the 
apparatuses of the present invention. Briefly, these systems direct reagent flow within an
interconnected channel and/or chamber containing structure through the application of electrical fields
to the reagents. The electrokinetic systems concomitantly regulate voltage gradients applied across at
least two intersecting channels. Such systems are described, e.g., in WO 96/04547 and U.S. Patent 
No. 6,107,044. 

[0223] An exemplary electrokinetic flow based microfluidic device can have a body structure 
which includes at least two intersecting channels or fluid conduits, e.g., interconnected, enclosed
chambers, which channels include at least three unintersected termini. The intersection of two
channels refers to a point at which two or more channels are in fluid communication with each other,
and encompasses "T" intersections, cross intersections, "wagon wheel" intersections of multiple
channels, or any other channel geometry where two or more channels are in such fluid
communication. An unintersected terminus of a channel is a point at which a channel terminates not
as a result of that channel's intersection with another channel, e.g., a "T" intersection.

[0224] In some electrokinetic flow based apparatuses, at least three intersecting channels having at 
least four unintersected termini are present. In a basic cross channel structure, where a single 
horizontal channel is intersected and crossed by a single vertical channel, controlled electrokinetic 
transport operates to direct reagent flow through the intersection, by providing constraining flows
from the other channels at the intersection. Simple electrokinetic flow of this reagent across the
intersection could be accomplished by applying a voltage gradient across the length of the horizontal
channel, i.e., applying a first voltage to the left terminus of this channel, and a second, lower voltage 
to the right terminus of this channel, or by allowing the right terminus to float (applying no voltage). 
[0225] In some other applications, the apparatus comprises a micro-fabricated flow cell with
external mini-fluidics. The glass cover slip can be anodically bonded to the surface of the flow cell.
The interrogation region is 100 µm × 100 µm × 100 µm, while the input and output channels are 100 µm
× 100 µm × 100 µm. Holes for the attachment of plumbing are etched at the ends of the channels. For
such apparatuses, the fluidics can be external. Plumbing can be performed with standard HPLC
components, e.g., from Upchurch and Hamilton. In the interrogation region, the polynucleotide
template or primer can be attached to the surface with standard avidin-biotin chemistry, for example.
[0226] Multiple copies of templates can be attached to the apparatus. For example, for a 7 kb 
template, the radius of gyration is approximately 0.2 µm. Therefore, about 10^5 molecules can be
[...] e.g., an Upchurch six-port injection valve and driven by, e.g., a Thar Designs motor. Fluid can be
pumped with a syringe pump. The detection system can be an external optical microscope, with the
objective being in close proximity to the glass cover slip. 
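
As a rough check on the template-density estimate given above for a 7 kb template, the surface can be treated as tiled with discs whose radius equals the radius of gyration. The sketch below is illustrative only: the 100 µm × 100 µm attachment area is an assumption borrowed from the interrogation-region dimensions, and the function name `max_templates` is hypothetical.

```python
import math

def max_templates(area_um2: float, radius_gyration_um: float) -> float:
    """Upper bound on non-overlapping templates on a surface.

    Each template is modeled as a disc whose radius is its radius of gyration.
    """
    footprint_um2 = math.pi * radius_gyration_um ** 2
    return area_um2 / footprint_um2

# Assumed surface: 100 um x 100 um; radius of gyration 0.2 um (7 kb template).
n = max_templates(100.0 * 100.0, 0.2)
print(f"{n:.2e} templates")
```

Under these assumed values the bound is roughly 8 × 10^4 molecules, which is consistent with an order-of-magnitude figure of 10^5.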

V. Detection of Incorporated Signals

A. Detection System in General

[0227] Certain embodiments of the present invention provide for detection of incorporation of a single
nucleotide into a single target polynucleotide. A number of methods are available for this purpose (see, e.g., Nie
et al., Science 266: 1018, 1994; Funatsu et al., Nature 374: 555, 1995; Mertz et al., Optics Letters 20:
2532, 1995; and Unger et al., Biotechniques 27:1008, 1999). Methods for visualizing single
molecules of polynucleotides labeled with an intercalating dye include, e.g., fluorescence microscopy
as described in Houseal et al., Biophysical Journal 56: 507, 1989. Even the fluorescent spectrum and
lifetime of a single molecule excited-state can be measured (Macklin et al., Science 272: 255, 1996).
Standard detectors such as a photomultiplier tube or avalanche photodiode can be used. Full field
imaging with a two-stage image intensified CCD camera can also be used (Funatsu et al., supra).
Additionally, a low noise cooled CCD can also be used to detect single fluorescent molecules (see, e.g.,
Unger et al., Biotechniques 27: 1008-1013, 1999; and SenSys spec:
http://www.photomet.com/p^ 

[0228] The detection system for the signal may depend upon the labeling moiety used, which can 
be defined by the chemistry available. For optical signals, a combination of an optical fiber or
charge-coupled device (CCD) can be used in the detection step. In those circumstances where the
substrate is itself transparent to the radiation used, it is possible to have an incident light beam pass
through the substrate with the detector located opposite the substrate from the polynucleotides. For
electromagnetic labeling moieties, various forms of spectroscopy systems can be used. Various
physical orientations for the detection system are available and discussion of important design
parameters is provided in the art (e.g., Arndt-Jovin et al., J Cell Biol 101: 1422-33, 1985; and Marriott
et al., Biophys J 60: 1374-87, 1991).

[0229] A number of approaches can be used to detect incorporation of fluorescently-labeled 
nucleotides into a single polynucleotide molecule. Optical setups include near-field scanning
microscopy, far-field confocal microscopy, wide-field epi-illumination, light scattering, dark field
microscopy, photoconversion, single and/or multiphoton excitation, spectral wavelength
discrimination, fluorophore identification, evanescent wave illumination, and total internal reflection
fluorescence (TIRF) microscopy. General reviews are available describing these technologies,
including, e.g., Basche et al., eds., 1996, Single molecule optical detection,
spectroscopy, Weinheim: VCM; and Plakhotnik et al., Single-molecule spectroscopy, Ann. Rev.
Phys. Chem. 48: 181-212. In general, the methods involve detection of laser-activated fluorescence
using a microscope equipped with a camera. It is sometimes referred to as a high-efficiency photon
detection system (see, e.g., Nie et al., 1994, Probing individual molecules with confocal fluorescence
microscopy, Science 266:1018-1019). Other suitable detection systems are discussed in the Examples 
below. 

[0230] Suitable photon detection systems include, but are not limited to, photodiodes and
intensified CCD cameras. For example, an intensified charge-coupled device (ICCD) camera can be
used. The use of an ICCD camera to image individual fluorescent dye molecules in a fluid near a 
surface provides numerous advantages. For example, with an ICCD optical setup, it is possible to 
acquire a sequence of images (movies) of fluorophores. 
B. Total Internal Reflection Fluorescence (TIRF) Microscopy 

[0231] Some embodiments of the present invention use total internal reflection fluorescence
(TIRF) microscopy for two-dimensional imaging. TIRF microscopy uses totally internally reflected
excitation light and is well known in the art. See, e.g., Watkins et al., J Biomed Mater Res 11:915-38,
1977; and Axelrod et al., J Microsc, 129:19-28, 1983. In certain embodiments, detection is carried
out using evanescent wave illumination and total internal reflection fluorescence microscopy. An
evanescent light field can be set up at the surface, for example, to image fluorescently-labeled
polynucleotide molecules. When a laser beam is totally reflected at the interface between a liquid and
a solid substrate (e.g., a glass), the excitation light beam penetrates only a short distance into the
liquid. In other words, the optical field does not end abruptly at the reflective interface, but its
intensity falls off exponentially with distance. This surface electromagnetic field, called the
'evanescent wave', can selectively excite fluorescent molecules in the liquid near the interface. The
thin evanescent optical field at the interface provides low background and facilitates the detection of 
single molecules with high signal-to-noise ratio at visible wavelengths (see, M. Tokunaga et al., 
Biochem. and Biophys. Res. Comm. 235, 47 (1997) and P. Ambrose, Cytometry, 36, 244 (1999)). 
[0232] The evanescent field can also image fluorescently-labeled nucleotides upon their
incorporation into the immobilized target polynucleotide-primer complex in the presence of a
polymerizing agent. Total internal reflection (TIR) fluorescence microscopy can then be used to
visualize the immobilized target polynucleotide-primer complex and/or the incorporated nucleotides
with single molecule resolution. With TIR technology, the excitation light (e.g., a laser beam)
illuminates only a small volume of solution close to the substrate, called the excitation zone. Signals
from free (unincorporated) nucleotides in solution outside the excitation zone would not be detected.
Signals from free nucleotides that diffuse into the excitation zone would appear as a broad band
background because the free nucleotides move quickly across the excitation zone.
[0233] TIRF microscopy has been used to examine various molecular or cellular activities, e.g.,
cell/substrate contact regions of primary cultured rat myotubes with acetylcholine receptors labeled by
fluorescent alpha-bungarotoxin, and human skin fibroblasts labeled with a membrane-incorporated
fluorescent lipid (see, e.g., Thompson et al., Biophys J. 33:435-54, 1981; Axelrod, J. Cell. Biol. 89:
141-5, 1981; and Burghardt et al., Biochemistry 22:979-85, 1983). TIRF examination of cell/surface
contacts dramatically reduces background from surface autofluorescence and debris. TIRF has also
been combined with fluorescence photobleaching recovery and correlation spectroscopy to measure
the chemical kinetic binding rates and surface diffusion constant of fluorescent labeled serum protein
binding to a surface at equilibrium (see, e.g., Burghardt et al., Biophys J. 33:455-67, 1981; and
Thompson et al., Biophys J. 43:103-14, 1983). Additional examples of TIRF detection of single
molecules have been described in Vale et al., 1996, Direct observation of single kinesin molecules
moving along microtubules, Nature 380: 451; and Xu et al., 1997, Direct Measurement of Single-
Molecule Diffusion and Photodecomposition in Free Solution, Science 275: 1106-1109.
[0234] The penetration of the field beyond the substrate depends on the wavelength and the laser 
beam angle of incidence. Deeper penetrance is obtained for longer wavelengths and for smaller 
angles to the surface normal within the limit of the critical angle. In typical assays, fluorophores are 
detected within about 200 nm from the surface, which corresponds to the contour length of about 600 
base pairs of a polynucleotide. In some embodiments, when longer polynucleotide templates are 
analyzed, the polymerizing agent rather than the template or primer can be immobilized to the surface 
so that reaction occurs near the surface at all times. In some embodiments, a prism-type TIRF 
geometry for single-molecule imaging, as described by Xu and Yeung, is used (see, X-H.N. Xu et al.,
Science, 281, 1650 (1998)). In some embodiments, an objective type TIRF is used to provide space
above the objective so that a microfluidic device can be used (see, e.g., Tokunaga et al., Biochem
Biophys Res Commun 235: 47-53, 1997; Ambrose et al., Cytometry 36:224, 1999; and Braslavsky et al.,
Applied Optics 40:5650, 2001).
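
The dependence of penetration on wavelength and incidence angle described above can be illustrated with the standard expression for the 1/e intensity decay depth of an evanescent field, d = λ / (4π · sqrt(n1² sin²θ − n2²)). This is a minimal sketch, not a description of any particular instrument: the refractive indices (glass 1.52, water 1.33), the 532 nm wavelength, the 0.34 nm per base pair rise of B-form DNA, and the function names are all assumptions chosen for illustration.

```python
import math

def penetration_depth_nm(wavelength_nm, n_substrate, n_liquid, angle_deg):
    """1/e intensity decay depth of the evanescent field.

    angle_deg is measured from the surface normal and must exceed the
    critical angle for total internal reflection.
    """
    s = n_substrate * math.sin(math.radians(angle_deg))
    if s <= n_liquid:
        raise ValueError("angle is not beyond the critical angle")
    return wavelength_nm / (4.0 * math.pi * math.sqrt(s * s - n_liquid * n_liquid))

def contour_length_nm(base_pairs, rise_nm_per_bp=0.34):
    """Contour length of double-stranded DNA for an assumed rise per base pair."""
    return base_pairs * rise_nm_per_bp

# Assumed values: glass (n = 1.52), water (n = 1.33), 532 nm excitation.
for angle in (62.0, 65.0, 70.0):
    depth = penetration_depth_nm(532.0, 1.52, 1.33, angle)
    print(f"{angle:.0f} deg -> {depth:.0f} nm")
print(f"600 bp ~ {contour_length_nm(600):.0f} nm")
```

The depth shrinks as the angle moves further past the critical angle and grows with wavelength, which matches the qualitative statement in the text; the contour-length helper reproduces the correspondence between about 200 nm and about 600 base pairs.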

[0235] Total internal reflection can be utilized with high numerical aperture objectives (ranging
between about 1.4 and about 1.65 in aperture), for example, using an inverted microscope. The
numerical aperture of an objective is a function of the maximum angle that can be collected (or illuminated)
with the objective in a given refractive index of the media (i.e., NA = n*sin(θmax)). If θmax is larger
than θcritical for reflection, some of the illuminated rays will be totally internally reflected. Using the
periphery of a large NA objective, one can illuminate the sample with TIR through the objective and
use the same objective to collect the fluorescence light. That is, in certain embodiments the objective
can play double roles as a condenser and an imaging objective.
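
The relation between numerical aperture and the critical angle can be checked numerically. The following sketch assumes an immersion medium and substrate with a common index of 1.518 and an aqueous sample of index 1.33; these values and the function names are illustrative assumptions, not parameters stated in the text. Under that assumption, the condition θmax > θcritical reduces to NA > n_sample.

```python
import math

def max_angle_deg(na, n_immersion):
    """Largest ray half-angle of the objective, from NA = n * sin(theta_max).

    Requires na <= n_immersion (e.g., NA 1.65 needs a higher-index medium).
    """
    return math.degrees(math.asin(na / n_immersion))

def critical_angle_deg(n_substrate, n_sample):
    """Critical angle for total internal reflection at the substrate/sample interface."""
    return math.degrees(math.asin(n_sample / n_substrate))

def tir_possible(na, n_sample):
    """True when some rays exceed the critical angle (immersion index equals substrate index)."""
    return na > n_sample

# Assumed indices: immersion oil and glass 1.518, aqueous sample 1.33.
theta_max = max_angle_deg(1.4, 1.518)
theta_c = critical_angle_deg(1.518, 1.33)
print(f"theta_max = {theta_max:.1f} deg, theta_critical = {theta_c:.1f} deg")
print("TIR through objective possible:", tir_possible(1.4, 1.33))
```

Only the rays between θcritical and θmax, which pass through the periphery of the objective, are totally internally reflected, which is why the illumination is steered to the edge of the aperture.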

[0236] In certain embodiments, single molecule detection can be achieved using flow cytometry 
where flowing samples are passed through a focused laser with a spatial filter used to define a small 
volume. U.S. Pat. No. 4,979,824 describes a device for this purpose. U.S. Pat. No. 4,793,705
describes a detection system for identifying individual molecules in a flow train of the particles in a
flow cell. It further describes methods of arranging a plurality of lasers, fluorescence filters and
detectors for detecting different fluorescent nucleic acid base-specific labels. U.S. Pat. No. 4,962,037
also describes a method for detecting an ordered train of labeled nucleotides for obtaining DNA and
RNA sequences using an exonuclease to cleave the bases. Single molecule detection on solid
supports is also described in Ishikawa et al. (1994), Single-molecule detection by laser-induced
fluorescence technique with a position-sensitive photon-counting apparatus, Jpn. J.
Appl. Phys. 33:1571-1576. Ishikawa describes a typical apparatus involving a photon-counting
camera system attached to a fluorescence microscope. Lee et al. (Anal. Chem. 66:4142-4149, 1994)
describes an apparatus for detecting single molecules in a quartz capillary tube. The selection of
lasers is dependent on the labeling moiety and the quality of light required. For example, diode,
helium neon, argon ion, argon-krypton mixed ion, and double Nd:YAG lasers are useful in this 
invention. 

C. Excitation and Scanning 

[0237] In some embodiments, fluorescent excitation is performed with a Q-switched frequency
doubled Nd:YAG laser, which has a kHz repetition rate allowing many samples to be taken per
second. For example, a wavelength of about 532 nm is ideal for the excitation of rhodamine. It is a
standard device that has been used in the single molecule detection scheme (Smith et al., Science
253:1122, 1992). Further, a pulsed laser allows time resolved experiments, which are useful for
rejecting extraneous noise. In some embodiments, excitation can be performed with a mercury lamp
and signals from the incorporated nucleotides can be detected with a CCD camera (see, e.g., Unger et
al., Biotechniques 27:1008, 1999). 

[0238] In some embodiments, the scanning system may be able to reproducibly scan the substrate 
(e.g., synthesis channels in the apparatuses). Where appropriate, e.g., for a two dimensional substrate,
the scanning system may positionally define the templates or primers attached thereon to a
reproducible coordinate system. Positional identification may be repeatable in successive scan steps,
allowing correlation of the positions of identified signals. 

[0239] Incorporated signals can be detected by scanning the substrates or the synthesis channels. 
The substrates or synthesis channels can be scanned simultaneously or serially, depending on the 
scanning method used. The signals can be scanned using a CCD camera (TE/CCD512SF, Princeton
Instruments, Trenton, N.J.) with suitable optics (Ploem, J. S., in Fluorescent and Luminescent Probes
for Biological Activity, Mason, T. W., Ed., Academic Press, London, pp. 1-11, 1993), such as
described in Yershov et al. (Proc. Natl. Acad. Sci. 93:4913, 1996), or can be imaged by TV
monitoring (Khrapko et al., DNA Sequencing 1:375, 1991). For radioactive signals (e.g., 32P), a
phosphorimager device can be used (Johnston et al., Electrophoresis 11:355,
1990; and Drmanac et al., Electrophoresis 13:566, 1992). These methods are
particularly useful to achieve simultaneous scanning of multiple probe-regions. 
[0240] Various scanning systems can be employed in the methods and apparatus of the present 
invention. For example, electro-optical scanning devices described in, e.g., U.S. Pat. No. 5,143,854,
are suitable for use with the present invention. The system could exhibit many of the features of
photographic scanners, digitizers or even compact disk reading devices. For example, a model no.
PM500-A1 x-y translation table manufactured by Newport Corporation can be attached to a detector
unit. The x-y translation table is connected to and controlled by an appropriately programmed digital
computer such as an IBM PC/AT or AT compatible computer. The detection system can be a model
no. R943-02 photomultiplier tube manufactured by Hamamatsu, attached to a preamplifier, e.g., a
model no. SR440 manufactured by Stanford Research Systems, and to a photon counter, e.g., an
SR430 manufactured by Stanford Research Systems, or a multichannel detection device. Either digital
or analog signals may be advantageous in different embodiments or aspects of the invention.

[0241] The stability and reproducibility of positional localization in scanning of the invention may
determine the resolution for detecting closely positioned polynucleotide clusters on a two-dimensional
substrate. High resolution scanning, for example, allows successive monitoring at a given position,
mapping the results of repeated reaction cycles to one or more positionally-mapped polynucleotides or
polynucleotide complexes. As the resolution increases, the number of possible polynucleotides that
can be sequenced on a single substrate also increases. Crude scanning systems can resolve only on
the order of about 1000 µm, refined scanning systems can resolve on the order of about 100 µm, more
refined systems can resolve on the order of about 10 µm, and with optical magnification systems a
resolution on the order of about 1.0 µm is available. The resolution limit can depend on diffraction
limits and advantages can arise from using shorter wavelength radiation for fluorescent scanning
steps. However, with increased resolution, the time required to fully scan a substrate can increase and
a compromise between speed and resolution may be selected. Parallel detection devices, which 
provide high resolution with shorter scan-times, are applicable, for example, where multiple detectors 
are moved in parallel. 
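
To make the link between scanning resolution and capacity concrete, the number of separately resolvable positions on a square substrate scales as the square of (side length / resolution). The sketch below is illustrative: the 1 cm × 1 cm substrate size and the function name `resolvable_sites` are assumptions, while the four resolution values are the ones listed above.

```python
def resolvable_sites(side_um, resolution_um):
    """Number of distinct positions on a square substrate of the given side length."""
    per_side = int(side_um // resolution_um)
    return per_side * per_side

SIDE_UM = 10_000  # assumed 1 cm x 1 cm substrate
for res_um in (1000, 100, 10, 1):
    print(f"{res_um:>5} um -> {resolvable_sites(SIDE_UM, res_um):,} sites")
```

Each tenfold gain in resolution gives a hundredfold gain in addressable positions, and a correspondingly larger number of positions to visit, which is the speed versus resolution compromise noted in the text.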

[0242] In some applications, sensitivity may be more important than resolution. However, the
reliability of a signal can be pre-selected by continuing to count photons for longer periods of time at
positions where intensity of signal is lower. Although this may decrease scan speed, it can increase
reliability of the signal determined. Various signal detection and processing algorithms can be
incorporated into the detection system. In some embodiments, the distribution of signal intensities of
pixels across the region is evaluated to determine whether the distribution of intensities corresponds to
a true positive signal.
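
The idea of counting photons longer at dim positions can be sketched with a simple shot-noise model. Assuming Poisson statistics, a signal rate S and background rate B counted for time t give SNR = S·t / sqrt((S + B)·t), so the dwell time for a chosen SNR is t = SNR²·(S + B) / S². The model, the rates used, and the function name `dwell_time_s` are assumptions for illustration only.

```python
def dwell_time_s(signal_cps, background_cps, target_snr):
    """Counting time needed to reach target_snr under Poisson (shot) noise.

    Rates are in counts per second; SNR = S*t / sqrt((S + B)*t).
    """
    if signal_cps <= 0:
        raise ValueError("signal rate must be positive")
    return target_snr ** 2 * (signal_cps + background_cps) / signal_cps ** 2

# Hypothetical rates: a bright and a dim position over the same background.
for signal in (1000.0, 100.0):
    t = dwell_time_s(signal, 50.0, 10.0)
    print(f"signal {signal:.0f} cps -> dwell {t:.3f} s")
```

A position with one tenth the signal needs more than ten times the dwell time for the same reliability, which is why this approach trades scan speed for confidence.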

[0243] In some embodiments, detecting correlates intensity with the number of incorporated 
nucleotides. For example, by measuring increase of fluorescence as nucleotides are incorporated and 
quantifying the increase, the number of nucleotides bearing a given fluorescent moiety may be
calculated.
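
One simple way to turn an intensity increase into a nucleotide count is to divide the background-corrected intensity by the intensity of a single label and round. This is a sketch under the assumption that each incorporated label contributes equally and independently (no quenching or bleaching); the function name and the numbers are hypothetical.

```python
def estimate_label_count(intensity, background, unit_intensity):
    """Estimate how many labeled nucleotides produced the measured intensity.

    unit_intensity is the assumed signal from one fluorescent moiety.
    """
    if unit_intensity <= 0:
        raise ValueError("unit intensity must be positive")
    return max(0, round((intensity - background) / unit_intensity))

# Hypothetical readings in arbitrary units.
print(estimate_label_count(350.0, 50.0, 100.0))
```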

D. Sample Detection of Fluorescent Labeling Moieties

[0244] Briefly, the polynucleotide templates can be prepared as described above (e.g., cloned in 
single-stranded M13 plasmid, biotinylated, and attached to the surface of a substrate, e.g., the surface
of a synthesis channel, which has been pretreated using the PEM technique). After the primed,
single-stranded DNA is anchored to the substrate, e.g., to the synthesis channel in the flow cell, a
polymerizing agent and a nucleotide, e.g., dATP, may be flowed into the flow cell. A high fidelity
polymerizing agent with no exonuclease proofreading ability can be used. If the first base of the
DNA sequence following the primer is T, then the polymerizing agent incorporates the dATPs bearing
fluorescent moieties as labels. If the first base is anything else, no fluorescent molecules become
incorporated. The reagents can then be flowed out of the flow cell, and the fluorescence of the
polynucleotide measured. If no fluorescence is detected, the procedure can be repeated with one of
the other nucleotides. If fluorescence is detected, the identity of the first base in the sequence has
been determined. The fluorescence can be excited with, e.g., a Q-switched frequency doubled
Nd:YAG laser (Smith et al., Science 253: 1122, 1992).
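
The flow-and-detect procedure described above can be sketched as a small simulation. This is an idealized model, not the method of the invention as claimed: it assumes error-free incorporation, unblocked nucleotides (so a run of identical template bases is extended in one flow), a fixed A, C, G, T flow order, and complete signal removal between flows. The names `sequence_by_synthesis` and `COMPLEMENT` are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def sequence_by_synthesis(template, flow_order="ACGT", max_flows=1000):
    """Return the synthesized strand read out by cyclic single-nucleotide flows.

    template lists the template bases in the order the polymerizing agent
    encounters them after the primer.
    """
    pos = 0
    read = []
    flows = 0
    while pos < len(template) and flows < max_flows:
        base = flow_order[flows % len(flow_order)]
        flows += 1
        incorporated = 0
        # Incorporation occurs only while the flowed base pairs with the template.
        while pos < len(template) and COMPLEMENT[template[pos]] == base:
            incorporated += 1
            pos += 1
        if incorporated:  # fluorescence detected: base identity is known
            read.append(base * incorporated)
    return "".join(read)

print(sequence_by_synthesis("TTGCA"))
```

In this model a flow that produces no fluorescence simply moves on to the next nucleotide, and the number of labels detected in a single flow gives the length of a homopolymer run.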

[0245] In certain embodiments, each of the nucleotides employed has a detectably-distinguishable 
fluorophore associated with it. In such embodiments, a four-color instrument can be used having four
cameras and four excitation lasers or the image could be split to four quarters and imaged by a single
camera. For example, the micro-imager of Optical Insights LTD can split the image to four different
images in four different spectra in front of the port of the camera. Illumination with only one laser
excitation for four colors is possible if suitable dyes are used (see, e.g., Rosenblum et al., Nucleic
Acids Research 25:4500, 1997). For example, the BigDyes, available from Applied Biosystems,
have a single excitation wavelength spectrum and four different emission wavelength spectra
(http://www.appliedbiosystems.com/pro). Nanocrystals also have a
variety of emission wavelengths for a given excitation (see, e.g., U.S. Pat. No. 6,309,701; and Lacoste
et al., Proc. Natl. Acad. Sci. USA 97: 9461-6, 2000). Thus, it is possible to use such optical setups to
analyze a sequence of a polynucleotide. Moreover, many different polynucleotide molecules
immobilized on a substrate (e.g., a microscope slide) can be imaged and sequenced simultaneously.
[0246] In certain embodiments, the substrates (or, e.g., the synthesis channels) can be serially
scanned one by one, or row by row using a fluorescence microscope apparatus, such as described in
U.S. Patent Nos. 6,094,274, 5,902,723, 5,424,186, and 5,091,652. In some embodiments, standard
low-light level cameras, such as a SIT and image intensified CCD camera, are employed (see, Funatsu
et al., Nature 374, 555, 1995). An ICCD can be preferable to a cooled CCD camera because of its
better time resolution. These devices are commercially available (e.g., from Hamamatsu).
[0247] Alternatively, only the intensifier unit from Hamamatsu or DEP may be used and
incorporated into other less expensive or home-built cameras. If necessary, the intensifier can be
cooled. A custom-built camera can allow greater flexibility in component choice in a higher
performance device. Using a camera instead of an avalanche photodiode can provide the advantage of
imaging the whole field of view. This extra spatial information allows the development of new noise
reduction techniques. For example, one can use the fact that signals are expected from certain spatial
locations (i.e., where the polynucleotide template is attached) in order to reject noise.
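
The spatial noise-rejection idea can be sketched as a mask: only pixels near known template positions contribute to the signal, and bright pixels elsewhere are ignored. The image representation (a list of rows), the square window, and the function name `masked_signals` are assumptions made for this illustration.

```python
def masked_signals(image, sites, radius=1):
    """Sum pixel intensity in a square window around each known template site.

    Pixels outside every window are ignored, which rejects noise that does
    not coincide with an attached template.
    """
    rows, cols = len(image), len(image[0])
    totals = {}
    for (r, c) in sites:
        total = 0
        for i in range(max(0, r - radius), min(rows, r + radius + 1)):
            for j in range(max(0, c - radius), min(cols, c + radius + 1)):
                total += image[i][j]
        totals[(r, c)] = total
    return totals

# Hypothetical 5 x 5 frame: one real signal at (2, 2), one noise spike at (0, 4).
frame = [[0] * 5 for _ in range(5)]
frame[2][2] = 10
frame[0][4] = 99
print(masked_signals(frame, [(2, 2)]))
```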
[0248] In some embodiments, polynucleotide sequences are analyzed with a fluorescent
photobleaching method. Fluorescently labeled nucleotides can be used in the primer extension, and
signals from the incorporated nucleotides can be removed by photobleaching before the next
extension cycle. That is, the fluorescence signal can be photobleached and in some cases
extinguished before the procedure is repeated for the next base in the template sequence. 
[0249] In certain embodiments, only a fraction of each type of nucleoside triphosphate is 
fluorescently labeled. That is, only a fraction (e.g., less than about 10%, about 5%, about 1%, about 
0.1%, about 0.01%, or about 0.001%) of each type of nucleotide triphosphate may be fluorescently
labeled (e.g., rhodamine-labeled nucleotide triphosphates from NEN DuPont can be used).

E. Detection using Fluorescence Resonance Energy Transfer (FRET)

[0250] In some embodiments of the present invention, incorporation of different types of 
nucleotides into a primer is detected using different fluorescent labeling moieties on the different
types of nucleotides. One class of fluorescent dyes which has been developed is the fluorescence
resonance energy transfer (FRET) dyes, including donor and acceptor energy fluorescent dyes and
linkers useful for DNA sequencing. When two different labels are incorporated into the primer in
close vicinity, signals due to fluorescence resonance energy transfer (FRET) can be detected. FRET
is a phenomenon that has been well documented in the literature, e.g., in T. Förster, Modern Quantum
Chemistry, Istanbul Lectures, Part III, 93-137, 1965, Academic Press, New York; and Selvin,
"Fluorescence Resonance Energy Transfer," Methods in Enzymology 246: 300-335, 1995.
[0251] In FRET, one of the fluorophores (donor) has an emission spectrum that overlaps the
excitation spectrum of the other fluorophore (acceptor) and transfer of energy takes place from the
donor to the acceptor through fluorescence resonance energy transfer. The energy transfer is mediated
by dipole-dipole interaction. Spectroscopically, the acceptor moiety is a fluorophore which is excited
at the wavelength of light emitted by the excited donor moiety. When excited, the donor moiety
transmits its energy to the acceptor moiety. Therefore, emission from the donor is not observed.
Rather, emission from the donor excites the acceptor, causing the acceptor to emit at its characteristic
wavelength (i.e., a wavelength different from that of the donor and observed as a different color from
that of the donor).

[0252] In FRET, when the donor is excited, its specific emission intensity decreases while the
acceptor's specific emission intensity increases, resulting in fluorescence enhancement. Also,
attachment of acceptor moieties with differing emission spectra allows differentiation among different
nucleotide base-types by fluorescence using a single excitation wavelength.

[0253] Moreover, the donor excites acceptors only within the Förster radius of a given FRET pair,
thus creating a highly localized excitation source and reducing background noise from moieties
outside this Förster radius. For example, FRET signals can be detected from individual
polynucleotides when a donor-acceptor pair is incorporated into the same target
polynucleotide-primer complex.

[0254] FRET pairs can be chosen to have a given Förster radius, for example, about 1 nm, about 2
nm, about 3 nm, about 4 nm, about 5 nm, about 6 nm, about 7 nm, about 8 nm, about 9 nm, or about
10 nm. Noise from any non-specific attachment of fluorescently-labeled nucleotides to the surface of
the substrate can become small, as the effective region of fluorescent illumination will only be a few
nanometers. Furthermore, for photobleaching, bleaching radiation may be selected to extinguish
signal from the acceptor but not the donor, facilitating repeated use of the same donor moiety with
different acceptor moieties, as described above.
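To make the localization argument concrete, the following minimal sketch evaluates the standard Förster relation E = 1 / (1 + (r / R0)^6), where r is the donor-acceptor distance and R0 is the Förster radius. The relation is textbook FRET theory rather than something stated in this specification, and the function name and the sample distances are illustrative assumptions.

```python
def fret_efficiency(distance_nm, forster_radius_nm):
    """Transfer efficiency from the standard Förster relation.

    E = 1 / (1 + (r / R0)**6); by definition E = 0.5 when r equals R0.
    """
    if distance_nm < 0 or forster_radius_nm <= 0:
        raise ValueError("distance must be >= 0 and Förster radius must be > 0")
    return 1.0 / (1.0 + (distance_nm / forster_radius_nm) ** 6)


R0 = 5.0  # nm; one of the example radii listed above
for r in (2.5, 5.0, 10.0, 20.0):  # hypothetical donor-acceptor distances
    print(f"r = {r:5.1f} nm  E = {fret_efficiency(r, R0):.4f}")
```

Because the efficiency falls with the sixth power of distance, an acceptor two Förster radii away receives under 2% transfer, which is consistent with the expectation that labels non-specifically attached outside the Förster radius contribute little signal.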

[0255] Detection of single molecule FRET signal reveals sequence information and facilitates
interpretation of the sequencing data. Detection of FRET signal in the present invention can be
performed according to various methods described in the art (e.g., U.S. Pat. No. 5,776,782). FRET
has been used to study various biological activities of biomacromolecules including polynucleotides.
For example, Cooper et al. disclosed fluorescence energy transfer in duplex and branched DNA
molecules (Biochemistry 29: 9261-9268, 1990). Lazowski et al. reported highly sensitive detection of
hybridization of oligonucleotides to specific sequences of nucleic acids by FRET (Antisense Nucleic
Acid Drug Dev. 10: 97-103, 2000). Methods for nucleic acid analysis using FRET were also
described in U.S. Pat. Nos. 6,177,249 and 5,945,283. Efficacy of using FRET to detect incorporation
of multiple nucleotides into a single polynucleotide molecule is exemplified in Example 8 of the
present application.

[0256] Any of a number of fluorophore combinations can be selected as donor-acceptor pairs for
labeling the nucleotides in the present invention for detection using FRET signals (see, for example,
Pesce et al., eds., Fluorescence Spectroscopy, Marcel Dekker, New York, 1971; White et al.,
Fluorescence Analysis: A Practical Approach, Marcel Dekker, New York, 1970; Handbook of
Fluorescent Probes and Research Chemicals, 6th Ed., Molecular Probes, Inc., Eugene, Oreg., 1996;
which are incorporated by reference). In general, a donor fluorophore is selected that has a substantial
spectrum overlap with that of the acceptor fluorophore. That is, the acceptor fluorophore's excitation
spectrum can substantially overlap the emission spectrum of the donor fluorophore. Furthermore, it
may also be desirable in certain applications that the donor have an excitation maximum near a laser
frequency such as Helium-Cadmium 442 nm or Argon 488 nm. In such applications, the use of
intense laser light can serve as an effective means to excite the donor fluorophore. Moreover, the
wavelength maximum of the emission spectrum of the acceptor moiety can be at least about 10 nm
greater than the wavelength maximum of the excitation spectrum of the donor moiety. That is, the
emission spectrum of the acceptor fluorophore can overlap with and be shifted compared to the donor
spectrum.

[0257] Suitable donors and acceptors operating on the principle of fluorescence energy transfer
include, but are not limited to, 4-acetamido-4'-isothiocyanatostilbene-2,2'-disulfonic acid; acridine and
derivatives: acridine, acridine isothiocyanate; 5-(2'-aminoethyl)aminonaphthalene-1-sulfonic acid
(EDANS); 4-amino-N-[3-(vinylsulfonyl)phenyl]naphthalimide-3,5-disulfonate;
N-(4-anilino-1-naphthyl)maleimide; anthranilamide; BODIPY; Brilliant Yellow; coumarin and derivatives:
coumarin, 7-amino-4-methylcoumarin (AMC, Coumarin 120), 7-amino-4-trifluoromethylcoumarin
(Coumarin 151); cyanine dyes; cyanosine; 4',6-diamidino-2-phenylindole (DAPI);
5',5"-dibromopyrogallol-sulfonephthalein (Bromopyrogallol Red);
7-diethylamino-3-(4'-isothiocyanatophenyl)-4-methylcoumarin; diethylenetriamine pentaacetate;
4,4'-diisothiocyanatodihydro-stilbene-2,2'-disulfonic acid; 4,4'-diisothiocyanatostilbene-2,2'-disulfonic
acid; 5-[dimethylamino]naphthalene-1-sulfonyl chloride (DNS, dansyl chloride);
4-dimethylaminophenylazophenyl-4'-isothiocyanate (DABITC); eosin and derivatives: eosin, eosin
isothiocyanate; erythrosin and derivatives: erythrosin B, erythrosin isothiocyanate; ethidium;
fluorescein and derivatives: 5-carboxyfluorescein (FAM), 5-(4,6-dichlorotriazin-2-yl)aminofluorescein
(DTAF), 2',7'-dimethoxy-4',5'-dichloro-6-carboxyfluorescein (JOE), fluorescein, fluorescein
isothiocyanate, QFITC (XRITC); fluorescamine; IR144; IR1446; Malachite Green isothiocyanate;
4-methylumbelliferone; ortho cresolphthalein; nitrotyrosine; pararosaniline; Phenol Red;
B-phycoerythrin; o-phthaldialdehyde; pyrene and derivatives: pyrene, pyrene butyrate, succinimidyl
1-pyrene butyrate; quantum dots; Reactive Red 4 (Cibacron™ Brilliant Red 3B-A); rhodamine and
derivatives: 6-carboxy-X-rhodamine (ROX), 6-carboxyrhodamine (R6G), lissamine rhodamine B
sulfonyl chloride, rhodamine (Rhod), rhodamine B, rhodamine 123, rhodamine X isothiocyanate,
sulforhodamine B, sulforhodamine 101, sulfonyl chloride derivative of sulforhodamine 101 (Texas
Red); N,N,N',N'-tetramethyl-6-carboxyrhodamine (TAMRA); tetramethyl rhodamine; tetramethyl
rhodamine isothiocyanate (TRITC); riboflavin; rosolic acid; terbium chelate derivatives; Cy3; Cy5;
Cy5.5; Cy7; IRD 700; IRD 800; La Jolla Blue; phthalo cyanine; and naphthalo cyanine.
[0258] In certain embodiments, the donor fluorescent moiety is coupled to the primer, and energy
is detected from acceptors on nucleotides as they are incorporated into the extending primer. Other
detecting techniques identifying interaction or correlation between a labeling moiety on a primer and a
labeling moiety on a nucleotide may also be used.

[0259] In certain embodiments, the donor fluorescent moiety is coupled to the polymerizing agent,
and energy is detected from acceptors on incorporated nucleotides. Other detecting techniques
identifying interaction or correlation between a labeling moiety on a polymerizing agent and a
labeling moiety on a nucleotide may also be used.

[0260] Another approach to reducing background involves "turning on" a labeling moiety as it
becomes incorporated into the complementary strand. For example, some embodiments use
nucleotides comprising a fluorescent labeling moiety and a quenching moiety. Locating the
quenching moiety on the β- or γ-phosphate of a nucleotide triphosphate quenches fluorescence from
unincorporated nucleotides, while allowing fluorescence from incorporated nucleotides. This makes
use of the chemistry of nucleotide incorporation, in which the β- and γ-phosphates of a nucleotide
triphosphate are released during the incorporation reaction as pyrophosphate, to "turn on" the labeling
moiety on incorporated nucleotides.

[0261] Additional techniques may be used to suppress background interference and/or improve
detection of fluorescent labels. These include, for example, spectral wavelength discrimination and
fluorophore identification. Further, increases or decreases in fluorescent intensity may be measured
and quantified, to correlate signal intensity with the number of incorporated nucleotides. Certain
embodiments can utilize additional visualization techniques, including for example single and/or
multiphoton excitation, light scattering, dark field microscopy, and/or photoconversion. In yet other
embodiments, detection can be carried out by non-optical and/or electronic procedures, as outlined
below.

F. Quantum Dots 

[0262] Another means of detection involves using quantum dots as the labeling moiety. A
quantum dot is a nanoscale metal or semiconductor particle. A quantum dot can be made to
fluoresce in various colors for days, months, and perhaps years.
http://www.sciencenews.org/20030215/bob10.asp. In some embodiments, the semiconductor
particles are made of a cadmium selenide core surrounded by a shell of, for example, zinc sulfide,
silicon, or polymer. Upon excitation with a light source, a quantum dot emits a particular color based
on its size, where smaller dots fluoresce at shorter wavelengths (e.g., blue wavelengths) and bigger
dots emit longer wavelengths (e.g., red wavelengths).

[0263] Quantum dot diameters can be about 1 nm, about 2 nm, about 3 nm, about 4 nm, about 5
nm, about 6 nm, about 7 nm, about 8 nm, about 9 nm, about 10 nm, about 11 nm, about 12 nm, about
13 nm, about 14 nm, or about 15 nm. Different dot sizes may be used to create detectably
distinguishable labeling moieties for attaching, for example, to different nucleotide base-types.
Further, intensity can increase proportionally with the number of dots, permitting correlation of the
signal intensity with the number of incorporated nucleotides.

[0264] The quantum dot may be attached to any position on the nucleotide base, sugar, and/or Di- 
phosphate, with or without a linker, where the nucleotide bearing the dot remains capable of base- 
complementary incorporation by a polymerizing agent into a growing complementary strand. 

G. Non-Optical Detection 

[0265] Other than fluorescently-labeled nucleotides and optical detection devices, other methods of
detecting nucleotide incorporation are also contemplated in the present invention, e.g., in bulk
sequencing applications, including the use of mass spectrometry to analyze the reaction products, the
use of radiolabeled nucleotides, as well as electronic means, the detection of reaction products using
"wired enzymes", and reactive labeling moieties and/or enzymatic labeling moieties.
[0266] In some embodiments, mass spectrometry is employed to detect nucleotide incorporation in
the primer extension reaction. A primer extension reaction generally consumes a nucleotide
triphosphate, adds a single base to the primer/template complex, and produces pyrophosphate as a
by-product. Mass spectrometry can be used to detect released pyrophosphate, after providing one or
more nucleotides in the presence of the template and a polymerizing agent. The absence of
pyrophosphate indicates that the nucleotide was not incorporated, whereas the presence of
pyrophosphate indicates incorporation. Detections based on pyrophosphate release have been
described in the art, e.g., in WO98/13523, WO98/28440, and Ronaghi et al., Science 281:363, 1998.
[0267] Certain embodiments use radiolabeled nucleotides. Nucleotides can be radiolabeled
at the sugar, the base, and/or the phosphate groups. To detect radioactivity, a small radioactivity
sensor can be incorporated in the substrate. A CCD pixel, for instance, serves as a good detector for
some radioactive decay processes. Radiolabeling of the sugar and/or base produces an additive
signal: each incorporation increases the amount of radiolabel in the primer-template complex. If the
nucleotide is labeled in the portion that is released as pyrophosphate (e.g., dNTP labeled with β- or
γ-³²P), the radioactive pyrophosphate can be detected, for example in the wash stream. This
radioactivity level need not be additive, but rather can be binary for each attempted nucleotide
addition. Consequently, subsequent additions may pose no limit on the read length. Due to the small
reagent consumption and the contained nature of microfluidics, the total radioactivity used in such a
system may be relatively minimal, and containment relatively simple.

[0268] Certain embodiments detect incorporation electronically. Electronic procedures include,
for example, the use of sensitive electronic DNA detectors, such as ones developed by NASA Ames
Research Center, which employ a forest of carbon nanotubes to sense small amounts of
polynucleotides. See, e.g., http://www.trnniag.^ The sensitivity
of such a device is based on its small size and the electronic properties of carbon nanotubes. For
example, the device uses arrays of about 2- to about 200-square-micron chromium electrodes on a
silicon wafer. Multi-walled nanotubes ranging from about 30 to about 50 nanometers are packed onto
the electrodes at densities of anywhere from about 100 million to about 3 billion nanotubes per square
centimeter. One end of the nanotube contacts the electrode and the other is exposed at the surface where
target polynucleotides can be attached. Addition of complementary bases can increase the flow of
electrons through the nanotubes to the electrode. In some embodiments, the device may be sensitive
enough to detect a few million to a few thousand polynucleotide molecules and can be used in the
practice of certain embodiments of the present invention to detect single base extension.

[0269] Other electronic means of detecting polynucleotides have also been described. For
example, Fritz et al., Electronic detection of DNA by its intrinsic molecular charge, PNAS 2002, have
reported selective and real-time detection of DNA using an electronic readout. In such embodiments,
microfabricated silicon field-effect sensors are used to directly monitor the increase in surface charge
when polynucleotide strands hybridize on the sensor surface. Nanomolar polynucleotide
concentrations can be detected, for example, in bulk sequencing by synthesis applications.
[0270] Some embodiments using non-optical detection of pyrophosphate release make use of
"wired redox enzymes" as described in Heller et al., Analytical Chemistry 66:2451-2457, 1994;
and Ohara et al., Analytical Chemistry 65:3512-3517, 1993. Briefly, enzymes can be covalently
linked to a hydrogel matrix containing redox active groups capable of transporting charge. The
analyte to be detected is either acted on directly by a redox enzyme (either releasing or consuming
electrons) or consumed as a reagent in an enzymatic cascade that produces a substrate that is reduced
or oxidized by a redox enzyme. The production or consumption of electrons is detected at a metal
electrode in contact with the hydrogel. For the detection of pyrophosphate, an enzymatic cascade
using pyrophosphatase, maltose phosphorylase, and glucose oxidase can be employed.
Pyrophosphatase converts pyrophosphate into phosphate; maltose phosphorylase converts maltose (in
the presence of phosphate) to glucose 1-phosphate and glucose. Then, glucose oxidase converts the
glucose to gluconolactone and H2O2; this final reaction is the redox step which gives rise to a
detectable current at the electrode. Glucose sensors based on this principle are well known in the art,
and enzymatic cascades as described here have been demonstrated previously. Other enzymatic
cascades besides the specific example given here are also contemplated in the present invention. This
type of detection scheme allows direct electrical readout of nucleotide incorporation at each reaction
chamber or location, allowing easy parallelization.

[0271] As outlined above, some embodiments use nucleotides comprising a reactive moiety that
can undergo a reaction, for example, following incorporation, to create a detectable product. In such
embodiments, detection of the product can identify incorporation of the nucleotide. Reactive moieties
include, for example, biotin, as in biotin-dUTP; digoxin, as in digoxin-dUTP; fluorescein, as in
fluorescein-dUTP; and the like. Such reactive moieties bind to a corresponding member of a binding
pair, which is itself conjugated to an enzymatic moiety that produces a detectable reaction product.
For example, biotin-dUTP can bind horseradish peroxidase-conjugated streptavidin; digoxin-dUTP can
bind horseradish peroxidase-conjugated antidigoxin; and fluorescein-dUTP can bind alkaline
phosphatase-conjugated anti-fluorescein antibody. Additional enzymatic moieties include
galactosidase, luciferase, or acetylcholinesterase. Standard methods are known in the art for detecting
reaction products of these enzymatic moieties. Moreover, in certain other embodiments, the
enzymatic moiety can be attached to the nucleotide itself, and similarly detected by production of a
reaction product.

VII. Modes of Analysis
A. Movie Mode

[0272] Certain embodiments of the present invention involve visualizing incorporation of labeled
nucleotides into immobilized polynucleotide molecules in a time resolved manner, with single
molecule resolution. This involves a dynamic rather than a static approach to sequence analysis,
where the dynamic approach is termed "movie mode".

[0273] The present invention allows both static and dynamic approaches. The static approach
involves adding just one type of nucleotide bearing a labeling moiety to the polymerization reaction at
any given time. The signal is incorporated into the primer if the next template residue in the target
polynucleotide is the complementary type. This may be repeated with each of the other three types of
nucleotides until the correct residue is incorporated.

[0274] In the dynamic approach, all four types of nucleotides (with at least one type bearing a
labeling moiety) are simultaneously present, and incorporation of the signals into the complementary
strand is monitored temporally. For example, incorporated signals are imaged continuously,
preferably at a rate faster than the rate at which the nucleotides are incorporated into the primer. As
the polymerizing agent continues along the target polynucleotide, the polynucleotide sequence can be
determined from the temporal order of the incorporated labeling moieties into the growing
complementary strand.

[0275] In some embodiments, multiple types of labeled nucleotides (e.g., 2 to 4 types each labeled
with a different labeling moiety) can be added at the same time for the extension reactions. For
example, polynucleotide sequence analysis can be accomplished by using a different labeling
moiety on each of the four types of nucleotides. Incorporated signals are imaged and then optionally
neutralized before further incorporation cycles. Runs of identical bases (e.g., AAAAA) can be
identified by, for example, monitoring the intensity of the signal so that the number of labels at an
emitting spot can be quantitatively determined.

[0276] Certain embodiments use fewer than four types of labeling moieties and fewer than all four of
the nucleotides are labeled. In some embodiments, for example, only one type of labeled nucleotide is
added at a step, and each extension cycle may comprise four such steps in order to observe the
incorporation at the next complementary nucleotide. Alternatively, two types of nucleotides can be
labeled with the same or detectably distinguishable labeling moieties. By repeating the experiment
with different pairs (e.g., AT, AG, AC, TG, TC, GC), the original nucleotide sequence can be
delineated. Similarly, three types of nucleotides can be labeled with the same or detectably
distinguishable labels.

[0277] Certain embodiments use fewer than four types of labeling moieties, but all four of the
nucleotides are labeled. For example, using three different labeling moieties, each of three types of
nucleotides can bear a detectably distinguishable labeling moiety, and the fourth type can bear the
same labeling moiety as one of the other three types. In such embodiments, the analysis would need
to be repeated at least twice to determine the sequence of the target polynucleotide, while repeating
three times would increase accuracy. Alternatively, using two different labeling moieties, one of the
four types of nucleotides can bear one labeling moiety detectably distinguishable from the second
labeling moiety used on the other three types of nucleotides. In such embodiments, the analysis
would need to be repeated at least three times to determine the sequence of the target polynucleotide,
while repeating four times would increase accuracy.
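The reason repeated analyses resolve a shared label can be sketched as a set intersection. The example below is a toy illustration under several assumptions not stated in the specification: the color names, the two labeling schemes, and the function names are hypothetical, and the runs are assumed to be aligned position by position.

```python
def candidates(color, labeling):
    """Return the set of base types that carry `color` under `labeling`."""
    return {base for base, c in labeling.items() if c == color}


def decode(runs):
    """Combine runs of (labeling, observed_colors) into one call per position.

    Each entry of the result is a base letter when the runs leave exactly one
    candidate, otherwise the set of candidates that remain ambiguous.
    All runs are assumed to cover the same positions in the same order.
    """
    length = len(runs[0][1])
    calls = []
    for i in range(length):
        possible = {"A", "C", "G", "T"}
        for labeling, colors in runs:
            possible &= candidates(colors[i], labeling)
        calls.append(next(iter(possible)) if len(possible) == 1 else possible)
    return calls


# Three labels for four base types; the base sharing a label differs per run.
run1_labels = {"A": "red", "C": "green", "G": "blue", "T": "blue"}
run2_labels = {"A": "red", "C": "green", "G": "blue", "T": "red"}

true_bases = "GATTACA"  # used only to synthesize the observed colors
run1 = (run1_labels, [run1_labels[b] for b in true_bases])
run2 = (run2_labels, [run2_labels[b] for b in true_bases])

single = decode([run1])        # G and T remain ambiguous after one run
both = decode([run1, run2])    # the second run resolves every position
```

With one run, every "blue" position is either G or T; changing which base shares a label in the second run removes that ambiguity, which mirrors the statement that the analysis is repeated at least twice.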

[0278] Certain embodiments of the present invention are also useful in obtaining partial sequence
information of a target polynucleotide, e.g., by using only two or three labeled nucleotide species.
The relative positions of two or three nucleotide species in the sequence in conjunction with known
sequence databases can facilitate determination of the identity of the target sequence, i.e., whether it is
identical or related to a known sequence. For example, only two detectably distinguishable labeling
moieties can be used to identify already-sequenced regions. As an illustration, out of a known
universe of RNA transcripts, two colors would provide color patterns allowing identification. Such



approaches are useful, for example, in determining gene expressions by re-sequencing RNA 
transcripts and cDNA libraries. Such approaches are also useful in detecting mutations, such as SNPs 
or cancer mutations, in known genomic sequences. 
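A minimal sketch of identification from partial sequence information follows. It assumes that only two base types are labeled and that only their order of appearance is observed; the toy database, its names, and both function names are invented for illustration, and complementarity between template and extension product is ignored for simplicity.

```python
def color_pattern(sequence, labeled=("A", "C")):
    """Keep only the labeled base types, preserving their order."""
    return "".join(b for b in sequence if b in labeled)


def identify(observed, database, labeled=("A", "C")):
    """Names of known sequences whose labeled-base pattern starts with `observed`."""
    return sorted(name for name, seq in database.items()
                  if color_pattern(seq, labeled).startswith(observed))


# Hypothetical known sequences; real use would draw on a sequence database.
known = {
    "transcript_1": "GATTACAGGC",
    "transcript_2": "GTTCAGACTA",
    "transcript_3": "TTGACCAGTA",
}
matches = identify("AACA", known)   # pattern read from two colors only
```

In this toy case a four-symbol, two-color pattern already singles out one entry, whereas a one-symbol pattern such as "A" would still match two of the three.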
B. Bulk Analysis 

[0279] Certain embodiments of the present invention are directed to bulk analysis of a plurality of 
target polynucleotides in parallel, where the incorporation/extension reaction is performed with 
multiple copies of the template polynucleotide. For example, the experiment may involve 
simultaneously analyzing the sequences of a plurality of copies of the same or different target 
polynucleotides at a plurality of different locations on an array. 
C. Asynchronous and Short-Cycle Sequencing

[0280] Another aspect of the present invention features the advantages of asynchronous 
sequencing. As the invention involves sequencing at the single molecule level, there is no need to 
average information from many different targets. Thus, in some embodiments as illustrated in Figure 
17, if an incorporation reaction fails to occur on a particular target polynucleotide, it can be completed 
in a later cycle without producing erroneous information, or interfering with data from other target 
molecules being analyzed in parallel. Some embodiments feature a method of analyzing a sequence 
of target polynucleotides by allowing incorporation of nucleotides into complementary strands, where 
different numbers of nucleotides may be incorporated into different complementary strands in a given 
period of time. Later, a nucleotide that was not incorporated into at least one of the strands 
previously, but that subsequently becomes incorporated, can be identified. That is, a nucleotide that 
failed to be incorporated on a particular target at a given time can "catch up" later without adversely 
affecting sequencing information. 

[0281] The example illustrated in Figure 17 indicates asynchronous incorporation into two copies 
of a given target polynucleotide. A cytosine ("C") incorporates into the extension product of one 
copy of a target polynucleotide, but fails to incorporate into the other copy. During subsequent cycles 
of incorporation, however, a C can become incorporated, without adversely affecting sequencing 
information. Hence, it does not matter if an incorporation is missed now and then. 
[0282] Asynchronous incorporation also overcomes the need to run a cycle of incorporation to 
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complementary strand of a given number of bases, conventional chemistry teaches one to run each 
incorporation reaction to as close to completion as possible to improve yield. For example, 
nucleotides may be allowed to react in the presence of a polymerizing agent until at least one becomes 
incorporated into at least 99% of the complementary strands. This would produce a yield of (0.99)^n ×
100% for a complementary strand extended by n nucleotides. Figure 18 illustrates that obtaining
incorporation in 99% of the complementary strands, however, requires a period of several half-lives 
of the incorporation reaction, where one half-life is the time taken for at least one incorporation to 
occur in 50% of the complementary strands. Classically, the more strands that complete an 
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incorporation during each cycle, the more n-mers 
asynchronous incorporation, an incorporation that failed to occur on a particular target in one cycle
can "catch up" in later cycles, permitting the use of shorter, even if more numerous, cycles. 
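The arithmetic behind this comparison can be sketched as follows, assuming simple first-order kinetics in which the fraction of strands with at least one incorporation after t half-lives is 1 - 2^(-t). That kinetic model matches the definition of a half-life given above but is otherwise an assumption, as are the function names.

```python
import math


def full_length_yield(per_cycle_fraction, n):
    """Fraction of strands extended in every one of n cycles, e.g. 0.99**n."""
    return per_cycle_fraction ** n


def fraction_after(half_lives):
    """Fraction of strands with at least one incorporation after `half_lives`."""
    return 1.0 - 0.5 ** half_lives


def half_lives_for_fraction(fraction):
    """Half-lives needed for `fraction` of strands to incorporate at least once."""
    return math.log2(1.0 / (1.0 - fraction))


print(f"yield for n = 25 at 99% per cycle: {full_length_yield(0.99, 25):.3f}")
print(f"half-lives needed for 99%:         {half_lives_for_fraction(0.99):.2f}")
print(f"fraction extended after 0.8:       {fraction_after(0.8):.3f}")
```

Under this model, reaching 99% per cycle takes between six and seven half-lives, whereas a cycle of less than one half-life extends fewer than half of the strands, which is the trade-off that asynchronous incorporation makes acceptable.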
[0283] Accordingly, another aspect of the present invention features a short-cycle sequencing
method for analyzing a sequence of a target polynucleotide. Certain embodiments involve allowing a
cycle of incorporation reactions of a number of nucleotides into a complementary strand, halting the
cycle after a relatively short period of time, and detecting incorporation. In such embodiments,
halting occurs when only a small proportion of the strands have been extended, or when a large
proportion of the strands have only been extended by a few nucleotides. For example, the cycle
period may permit some chance of incorporation of two or fewer nucleotides into a given
complementary strand. The cycle period may be conveniently measured in half-lives of the
incorporation reaction, for example, a period of less than one to a few half-lives. Halting may be
carried out by washing or flushing out the nucleotides that remain unincorporated and/or washing or
flushing out polymerizing agent. The method can be repeated for a number of short cycles to
sequence additional nucleotides of the target polynucleotide by short-cycle sequencing. Further,
many aspects of the repeated cycles may be automated, for example, using microfluidics for washing
nucleotides to sites of anchored target polynucleotides, and washing out unincorporated nucleotides to
halt each cycle.

[0284] In certain embodiments, the target polynucleotide comprises a homopolymer stretch of
consecutive repeats of a given nucleotide base (e.g., AAAAAAAAAAA). In certain embodiments,
nucleotides of the same type bear the same labeling moiety (e.g., all A's carry a red fluorescent dye).
A long repeat of the same incorporated signal can be read in short-cycle sequencing as only a few
nucleotides will be incorporated during each cycle. Signal from the few incorporated nucleotides can
be detected and neutralized and/or reduced before subsequent cycles are carried out. Signals can be
removed after each cycle or after a number of cycles, for example, after a number of cycles that would
result in too many incorporated nucleotides for quantification.

[0285] In some embodiments, signal is reduced by bleaching, including chemical bleaching and
photobleaching. In some embodiments, signal is reduced by removing all or a portion of the labeling
moiety from incorporated nucleotides. The portion removed may be the signal generating portion.
Removal may involve cleaving by chemical, enzymatic or photo-chemical means. Removing can be
carried out after about one, about two, about three, about four, or about five cycles, depending, for
example, on the number of nucleotides allowed to be incorporated per cycle and the ability of the
detection means used to distinguish between increasing numbers of incorporated labeling moieties.
[0286] It will be appreciated that short-cycle sequencing can overcome problems of reading
homopolymer stretches in sequencing by synthesis methods, without using chain termination or
blocking moieties, such as chain elongation inhibitors. While detection techniques may be able to
quantify signal intensity from a smaller number of incorporated nucleotides of the same base-type, for
example two or three incorporated nucleotides, longer runs of identical bases may not permit
quantification due to increasing signal intensity. That is, it may become difficult to distinguish n
bases from n+1 bases, where the fractional increase in signal intensity from the (n+1)th base is small
relative to the signal intensity from the already-incorporated n bases.
[0287] In embodiments using short-cycles, however, it is possible to limit the number of
nucleotides that become incorporated in a given cycle. For example, it can be determined by
simulation that using a cycle period of about 0.8 half-lives can result in two or fewer incorporations in
nine out of ten homopolymer complementary strands. (See Example 11b). In another simulation, a
0.8 half-life period was shown to allow no more than two incorporations in about 96.0% of 200
homopolymer complementary strands. As detection means can more readily quantify signal intensity
from the smaller number of incorporated nucleotides rather than from larger numbers, the use of
short-cycles addresses this issue. For example, imaging systems known in the art can reliably
distinguish the difference in signal intensity between one versus two fluorescent labeling moieties on
consecutively-incorporated nucleotides. Other imaging systems can reliably distinguish the difference
in signal intensity between two versus three fluorescent labeling moieties on consecutively-
incorporated nucleotides.
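One simple way to reason about how a short cycle limits incorporations is an idealized Poisson model, sketched below. It assumes that successive incorporations on a homopolymer template are independent first-order events with the same half-life, so the count per cycle is Poisson with mean t × ln 2 for a cycle of t half-lives. This is not the simulation of the working examples, and its numbers need not match the simulated figures quoted above; the function name is hypothetical.

```python
import math


def prob_at_most(k, cycle_half_lives):
    """P(no more than k incorporations in one cycle) under the Poisson model.

    The mean count is cycle_half_lives * ln 2, chosen so that the chance of
    no incorporation after exactly one half-life is 0.5.
    """
    mean = cycle_half_lives * math.log(2)
    return sum(math.exp(-mean) * mean ** j / math.factorial(j)
               for j in range(k + 1))


for t in (0.5, 0.8, 1.0, 2.0, 4.0):  # cycle periods in half-lives
    print(f"cycle = {t:3.1f} half-lives  P(<= 2 incorporations) = "
          f"{prob_at_most(2, t):.3f}")
```

The trend is the point of the sketch: the shorter the cycle, the larger the share of strands that stay within the range of two or fewer labels that a detector can quantify reliably.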

[0288] Based on the methods disclosed herein, those of skill in the art will be able to determine the 
period of half-lives required to limit the number of incorporations per cycle for a given number of target 
polynucleotides. (See Examples 11 and 12, Figures 19 and 20). Statistical simulations can also 
provide the number of repeated cycles needed to obtain a given number of incorporations, for 
example, to sequence a 25 base pair sequence. (See Examples 11 and 12, Figures 19 and 20). 
Referring to the simulations above, for example, it can be determined that 60 cycles, each 
0.8 half-lives long, would be required for at least 25 incorporations in each of ten complementary strands 
(Example 11b, Figure 19b). With 200 complementary strands, 60 cycles each 0.8 half-lives long 
produce at least 20 incorporations in each strand (Example 12, Figure 20). Following the 
methodologies outlined herein, such as the simulated working examples detailed below, those of skill 
in the art will be able to make similar determinations for other numbers of targets of varying lengths, 
and use appropriate cycle periods and numbers of cycles to analyze homopolymer sequences without using 
blocking moieties or reversible chain termination. 
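
As a rough illustration of the kind of determination described above, the following Python sketch assumes that every incorporation into a homopolymer is an independent first-order event with the same half-life, so that the number of incorporations in one cycle is Poisson distributed. This constant-rate model and the function names are assumptions made for illustration only. They are not the simulations of Examples 11 and 12 and need not reproduce the figures quoted from them.

```python
import math


def mean_incorporations(period_half_lives):
    """Expected incorporations per cycle under the assumed constant-rate model."""
    return math.log(2) * period_half_lives


def prob_at_most(max_incorporations, period_half_lives):
    """Poisson probability of max_incorporations or fewer events in one cycle."""
    mu = mean_incorporations(period_half_lives)
    return sum(
        math.exp(-mu) * mu ** n / math.factorial(n)
        for n in range(max_incorporations + 1)
    )


def naive_cycle_estimate(target_length, period_half_lives):
    """Cycles needed if every cycle added exactly the mean number of bases."""
    return math.ceil(target_length / mean_incorporations(period_half_lives))


if __name__ == "__main__":
    # Single-strand chance of two or less incorporations in a 0.8 half-life cycle.
    print(round(prob_at_most(2, 0.8), 3))
    # Mean-based cycle count for 25 incorporations at 0.8 half-lives per cycle.
    print(naive_cycle_estimate(25, 0.8))
```

The mean-based estimate ignores strand-to-strand variation, so it understates the number of cycles needed for every strand in a population to reach the desired length.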

[0289] In some embodiments, the half life for the incorporation reaction is affected by the fact that 
polymerizing agent may incorporate labeled nucleotides less readily than unlabeled nucleotides. 
Figure 21 illustrates the statistics of incorporation for a certain embodiment using a Klenow 
exo-minus polymerizing agent and Cy3- or Cy5-labeled nucleotides. The statistics show that 
polymerizing agent may incorporate repeated labeled nucleotides less readily than the first labeled 
nucleotide. That is, the first incorporation may take place more quickly than subsequent 
incorporations, which require a labeled base to be incorporated into a polynucleotide strand already 
containing an incorporated labeled base. Without being limited to any hypothesis, this may be due to 
the polymerizing agent having difficulty incorporating labeled nucleotides "on top of" an already 
incorporated labeled nucleotide. The graph of Figure 21 indicates, for example, that a subsequent 
incorporation may take five to ten times longer, resulting in a "slowing down" of the incorporation reaction. In other 
embodiments, the slowing down may vary with the use of other labeled nucleotides, other 
polymerizing agents and various reaction conditions. 

[0290] For example, the rate at which a polymerizing agent incorporates labeled nucleotides into a 
complementary strand may be slowed down by a factor of about 2, about 3, about 4, about 5, about 6, 
about 7, about 8, about 9, about 10, about 11, about 12, or about 15 times compared to that observed 
with unlabeled nucleotides or compared to that observed for the first incorporated labeled nucleotide. 
This "slowdown" can result in a longer half life for an incorporation reaction with a given 
homopolymer error rate. 

[0291] Moreover, this slowing down and longer half life can be taken into account when 
determining appropriate cycle periods and numbers of cycles to analyze homopolymer targets of a 
given length. Figures 22 and 23, for example, illustrate the results of Monte Carlo simulations 
accounting for these factors. The graph of Figure 23, for example, shows the number of cycles 
needed with cycle periods of various half lives, taking into account slowdown factors of two 
(squares), five (triangles), and 10 (crosses), in order to obtain over 25 incorporations in over 80% of 
target homopolymers, with at least a 97% chance of incorporating two or less nucleotides per cycle (or 
a smaller than 3% chance of incorporating three or more nucleotides per cycle). As the graph shows, 
longer half lives permit fewer cycles to obtain the desired result while keeping the error rate low. 
That is, the longer half lives for a given homopolymer error rate permit the use of longer cycle 
periods, allowing more nucleotides to be incorporated per cycle, and hence requiring fewer numbers 
of repeated cycles to analyze a target sequence of given length at a given error rate. For example, as 
Figure 23 illustrates, if the use of labeled nucleotides slows down polymerizing agent by a factor of 5, 
a cycle period of 2.4 half lives may be used to analyze over 80% of 25-mers in 30 cycles, where no 
more than two nucleotides incorporate over 97% of the time in any given cycle. 
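
A Monte Carlo calculation of this general kind can be sketched in Python as follows. The sketch is hypothetical and rests on stated assumptions: time is measured in half-lives of the first incorporation in a cycle, the first incorporation in each cycle proceeds at the unslowed rate, and every later incorporation in the same cycle is slower by the slowdown factor. The function names, trial counts and seeds are illustrative, and the model is not asserted to be the one used to generate Figures 22 and 23.

```python
import math
import random


def incorporations_in_cycle(rng, period_half_lives, slowdown):
    """Count incorporations in one cycle for a single homopolymer strand.

    Assumption: the first incorporation has rate ln(2) per half-life and each
    later incorporation in the same cycle has that rate divided by `slowdown`.
    """
    base_rate = math.log(2)
    rate = base_rate
    time_left = period_half_lives
    count = 0
    while True:
        wait = rng.expovariate(rate)  # exponential waiting time for next base
        if wait > time_left:
            return count
        time_left -= wait
        count += 1
        rate = base_rate / slowdown


def fraction_at_most_two(period_half_lives, slowdown, trials=20000, seed=0):
    """Estimated chance that a cycle adds two or less nucleotides."""
    rng = random.Random(seed)
    hits = sum(
        incorporations_in_cycle(rng, period_half_lives, slowdown) <= 2
        for _ in range(trials)
    )
    return hits / trials


def fraction_reaching(length, cycles, period_half_lives, slowdown,
                      strands=2000, seed=1):
    """Estimated fraction of strands with at least `length` incorporations."""
    rng = random.Random(seed)
    reached = 0
    for _ in range(strands):
        total = sum(
            incorporations_in_cycle(rng, period_half_lives, slowdown)
            for _ in range(cycles)
        )
        reached += total >= length
    return reached / strands
```

Calling `fraction_at_most_two(2.4, 5)` and `fraction_reaching(25, 30, 2.4, 5)` gives estimates that can be set beside the values read from Figure 23, bearing in mind that the two models may differ.
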
[0292] Based on the instant disclosures, those of skill in the art can determine the cycle period 
required to limit the number of incorporations per cycle for a given number of target polynucleotides for 
a given half life, and the number of cycles required to analyze a sequence of a given length. That is, 
following the methodologies, simulations, and graphs provided herein, those of skill in the art will be 
able to make similar determinations for numbers of target polynucleotides of varying lengths, and use 
appropriate cycle periods and numbers of cycles for various half lives to analyze homopolymer 
sequences without using blocking moieties or reversible chain termination. 

[0293] For example, applying methods disclosed herein, the cycle period may be selected to permit 
about a 70%, about a 75%, about an 80%, about an 85%, about a 90%, about a 95%, about a 96%, 
about a 97%, about a 98%, and about a 99% chance of incorporation of two or less nucleotides into 
the complementary strand. Other cycle periods that may be used in embodiments of the invention 
include, for example, no more than about 5 half lives, no more than about 4 half lives, no more than 
about 3 half lives, no more than about 2 half lives, no more than about 1 half life, no more than about 
0.9 half lives, no more than about 0.8 half lives, no more than about 0.7 half lives, no more than about 
0.6 half lives, no more than about 0.5 half lives, no more than about 0.4 half lives, no more than about 
0.3 half lives, and no more than about 0.2 half lives of said incorporation reactions. 

[0294] The number of times the cycles are repeated can also be determined based on the methods 
described herein, to permit analysis of different numbers of target polynucleotides of varying lengths. 
The greater the length of sequence to be analyzed, and the shorter the cycle period used, the greater 
the number of times cycles will be repeated. Conversely, the greater the slowing down effect of 
incorporating labeled nucleotides, the longer the half life and the fewer the number of times cycles 
will be repeated. For example, the number of times may be at least about one, at least about two, at 
least about three, at least about four, at least about five, at least about six, at least about seven, at least 
about eight, at least about nine, at least about 10, at least about 30, at least about 50, at least about 
100, at least about 500, at least about 1,000, at least about 5,000, at least about 10,000, at least about 
50,000, at least about 100,000, and at least about 500,000. 

[0295] Further examples of combinations of cycle periods and the number of times the cycles are 
repeated that may be used in certain embodiments of the present invention include a cycle period of 
no more than about 1 half life, repeated at least about 40 times; a cycle period of no more than about 1 
half life, repeated at least about 50 times; a cycle period of no more than about 1 half life, repeated at 
least about 60 times; a cycle period of no more than about 1 half life, repeated at least about 70 times; 
a cycle period of no more than about 1 half life, repeated at least about 80 times; a cycle period of no 
more than about 0.9 half lives, repeated at least about 40 times; a cycle period of no more than about 
0.9 half lives, repeated at least about 50 times; a cycle period of no more than about 0.9 half lives, 
repeated at least about 60 times; a cycle period of no more than about 0.9 half lives, repeated at least 
about 70 times; a cycle period of no more than about 0.9 half lives, repeated at least about 80 times; a 
cycle period of no more than about 0.8 half lives, repeated at least about 40 times; a cycle period of no 
more than about 0.8 half lives, repeated at least about 50 times; a cycle period of no more than about 
0.8 half lives, repeated at least about 60 times; a cycle period of no more than about 0.8 half lives, 
repeated at least about 70 times; a cycle period of no more than about 0.8 half lives, repeated at least 
about 80 times; a cycle period of no more than about 0.7 half lives, repeated at least about 40 times; a 
cycle period of no more than about 0.7 half lives, repeated at least about 50 times; a cycle period of no 
more than about 0.7 half lives, repeated at least about 60 times; a cycle period of no more than about 
0.7 half lives, repeated at least about 70 times; a cycle period of no more than about 0.7 half lives, 
repeated at least about 80 times; a cycle period of no more than about 0.6 half lives, repeated at least 
about 40 times; a cycle period of no more than about 0.6 half lives, repeated at least about 50 times; a 
cycle period of no more than about 0.6 half lives, repeated at least about 60 times; a cycle period of no 
more than about 0.6 half lives, repeated at least about 70 times; a cycle period of no more than about 
0.6 half lives, repeated at least about 80 times; a cycle period of no more than about 0.5 half lives, 
repeated at least about 40 times; a cycle period of no more than about 0.5 half lives, repeated at least 
about 50 times; a cycle period of no more than about 0.5 half lives, repeated at least about 60 times; a 
cycle period of no more than about 0.5 half lives, repeated at least about 70 times; and a cycle period 
of no more than about 0.5 half lives, repeated at least about 80 times. 

[0296] Taking into account various slowing down factors, examples of cycle periods and numbers of 
repeat cycles that may be used in certain embodiments further include a cycle period of no more than 
about 0.5 half lives with a slowing down factor of about 2, repeated at least about 90 times; a cycle 
period of no more than about 0.75 half lives, with a slowing down factor of about 2, repeated at least 
about 75 times; a cycle period of no more than about 1 half life, with a slowing down factor of about 
2, repeated at least about 50 times; a cycle period of no more than about 1.5 half lives with a slowing 
down factor of about 2 or about 5, repeated at least about 45 times; a cycle period of no more than 
about 1.75 half lives, with a slowing down factor of about 5, repeated at least about 35 times; a cycle 
period of no more than about 2 half lives, with a slowing down factor of about 5 or about 10, repeated 
at least about 35 times; a cycle period of no more than about 2.25 half lives, with a slowing down 
factor of about 5 or about 10, repeated at least about 30 or at least about 35 times, and a cycle period 
of about 2.4 half lives, with a slowing down factor of about 5, repeated at least about 30 times. 
[0297] The cycle period may also be chosen to permit a certain chance of incorporation of a given 
number of nucleotides in a complementary strand, and the cycle may be repeated a number of times to 
analyze the sequence of various numbers of target polynucleotides of varying lengths. For example, 
the cycle period may permit about a 85% chance of incorporation of about two or less nucleotides and 
may be repeated at least about 40 times; the cycle period may permit about a 85% chance of 
incorporation of about two or less nucleotides and may be repeated at least about 50 times; the cycle 
period may permit about a 85% chance of incorporation of about two or less nucleotides and may be 
repeated at least about 60 times; the cycle period may permit about a 85% chance of incorporation of 
about two or less nucleotides and may be repeated at least about 70 times; the cycle period may 
permit about a 85% chance of incorporation of about two or less nucleotides and may be repeated at 
least about 80 times; the cycle period may permit about a 90% chance of incorporation of about two 
or less nucleotides and may be repeated at least about 40 times; the cycle period may permit about a 
90% chance of incorporation of about two or less nucleotides and may be repeated at least about 50 
times; the cycle period may permit about a 90% chance of incorporation of about two or less 
nucleotides and be repeated at least about 60 times; the cycle period may permit about a 90% chance 
of incorporation of about two or less nucleotides and be repeated at least about 70 times; the cycle 
period may permit about a 90% chance of incorporation of about two or less nucleotides and be 
repeated at least about 80 times; the cycle period may permit about a 95% chance of incorporation of 
about two or less nucleotides and be repeated at least about 40 times; the cycle period may permit 
about a 95% chance of incorporation of about two or less nucleotides and be repeated at least about 50 
times; the cycle period may permit about a 95% chance of incorporation of about two or less 
nucleotides and be repeated at least about 60 times; the cycle period may permit about a 95% chance 
of incorporation of about two or less nucleotides and be repeated at least about 70 times; the cycle 
period may permit about a 95% chance of incorporation of about two or less nucleotides and be 
repeated at least about 80 times; the cycle period may permit about a 96% chance of incorporation of 
about two or less nucleotides and be repeated at least about 40 times; the cycle period may permit 
about a 96% chance of incorporation of about two or less nucleotides and be repeated at least about 50 
times; the cycle period may permit about a 96% chance of incorporation of about two or less 
nucleotides and be repeated at least about 60 times; the cycle period may permit about a 96% chance 
of incorporation of about two or less nucleotides and be repeated at least about 70 times; the cycle 
period may permit about a 96% chance of incorporation of about two or less nucleotides and be 
repeated at least about 80 times; the cycle period may permit about a 97% chance of incorporation of 
about two or less nucleotides and be repeated at least about 40 times; the cycle period may permit 
about a 97% chance of incorporation of about two or less nucleotides and be repeated at least about 50 
times; the cycle period may permit about a 97% chance of incorporation of about two or less 
nucleotides and be repeated at least about 60 times; the cycle period may permit about a 97% chance 
of incorporation of about two or less nucleotides and be repeated at least about 70 times; the cycle 
period may permit about a 97% chance of incorporation of about two or less nucleotides and be 
repeated at least about 80 times; the cycle period may permit about a 98% chance of incorporation of 
about two or less nucleotides and be repeated at least about 40 times; the cycle period may permit 
about a 98% chance of incorporation of about two or less nucleotides and be repeated at least about 50 
times; the cycle period may permit about a 98% chance of incorporation of about two or less 
nucleotides and be repeated at least about 60 times; the cycle period may permit about a 98% chance 
of incorporation of about two or less nucleotides and be repeated at least about 70 times; and the cycle 
period may permit about a 98% chance of incorporation of about two or less nucleotides and be 
repeated at least about 80 times. 

[0298] In addition to the Examples provided below, various cycle periods and number of times the 
cycles are repeated may be used with various numbers of targets in certain embodiments of the 
invention. These include, for example, using about 200 target polynucleotides, a period of no more 
than about 0.6 half lives and repeating at least about 50 times; using about 200 target polynucleotides, 
a period of no more than about 0.6 half lives and repeating at least about 60 times; using about 200 
target polynucleotides, a period of no more than about 0.6 half lives and repeating at least about 70 
times; using about 200 target polynucleotides, a period of no more than about 0.8 half lives and 
repeating at least about 50 times; using about 200 target polynucleotides, a period of no more than 
about 0.8 half lives and repeating at least about 60 times; using about 200 target polynucleotides, a 
period of no more than about 0.8 half lives and repeating at least about 70 times; using about 200 
target polynucleotides, a period of no more than about 1 half life and repeating at least about 50 times; 
using about 200 target polynucleotides, a period of no more than about 1 half life and repeating at 
least about 60 times; and using about 200 target polynucleotides, a period of no more than about 1 
half life and repeating at least about 70 times. In any of these embodiments, signal from incorporated 
nucleotides may be reduced after each or a number of cycles. 
D. Address Identification of Randomly Attached Molecules 

[0299] Another aspect of the present invention features a method of identifying the address of a 
polynucleotide molecule randomly-bound to a substrate. In such embodiments, the polynucleotide 
molecule is allowed to attach to any random position on the surface of the substrate, and thereafter its 
position is detected by allowing an oligonucleotide primer to hybridize to a sufficiently 
complementary region of the polynucleotide molecule, and/or by allowing extension of the primer by 
nucleotides complementary to the polynucleotide molecule. In either case, detecting the location of a 
hybridized and/or incorporated nucleotide permits identification of the address of the randomly-bound 
polynucleotide molecule. Furthermore, detecting an incorporated nucleotide permits address 
identification of a polynucleotide molecule that bound as a useful template for the polymerization 
reaction. That is, it identifies the location of a randomly-bound polynucleotide molecule that attached 
to the surface in such a way as to be available to the polymerizing agent and capable of directing 
synthesis of its complementary strand. 

[0300] In some embodiments, both the primer and the incorporated nucleotide bear labeling 
moieties, where each labeling moiety produces a distinguishably detectable signal. In these 
embodiments, cross-correlation of the position of primer signal and the position of the nucleotide 
signal allows additional accuracy in locating the address, as discussed in more detail below. 
E. Single Base Extension of Randomly Attached Molecules 

[0301] Another aspect of the present invention features a method of analyzing a sequence of a 
polynucleotide molecule randomly-bound to a substrate by single base extension. In such 
embodiments, the polynucleotide molecule is allowed to attach to any random position on the surface 
of the substrate, and thereafter synthesis of its complementary strand is allowed in the presence of a 
polymerizing agent. Detecting incorporation of single nucleotides into the growing complementary 
strand analyzes the sequence of the randomly-bound polynucleotide molecule. 
[0302] Some embodiments use a primer bearing a labeling moiety distinguishably detectable from 
the labeling moiety used on the nucleotides being incorporated into the complementary strand. This 
can allow detecting spots on the substrate to determine where the polynucleotide molecules are 
attached, and then monitoring for subsequent nucleotide incorporation events at these locations. In 
these embodiments, cross-correlation of the primer signal and the nucleotide signal provides additional 
information about an address of a primer-polynucleotide complex. 

[0303] For example, the primer may bear a labeling moiety that produces fluorescence of a 
particular color (e.g., green). One type of nucleotide may bear the same labeling moiety, while one or 
more other nucleotide types may bear a detectably distinguishable labeling moiety (e.g., red 
fluorescence). If incorporation of the first type of nucleotide indicates, for example, a wild type 
sequence, and incorporation of any other nucleotide indicates a variant, detecting different 
cross-correlated signals from the same primer-polynucleotide complex would indicate the variant (e.g., both 
green and red). Conversely, detecting only one color after cross-correlating signals would indicate 
wild type (e.g., green and green). It will be appreciated that other combinations of colors can be used 
to indicate different sequences. For example, the wild type nucleotides may bear a different labeling 
moiety, and other nucleotide types may bear the same labeling moiety as the primer or no labeling 
moiety. In this scenario, cross-correlation of different signals indicates wild type, whereas a variant 
would be recognized by there being only one color from a primer-polynucleotide complex. 
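
The two color schemes just described can be written out as a small decision rule. The Python sketch below is illustrative only: the function name, the string color labels, the use of `None` for an absent nucleotide signal, and the "no call" outcome are assumptions introduced here and do not come from the disclosure.

```python
def call_genotype(primer_color, nucleotide_color, wild_type_matches_primer=True):
    """Classify a primer-polynucleotide complex from its cross-correlated colors.

    nucleotide_color is None when no nucleotide signal was detected at the
    primer's address.
    """
    if wild_type_matches_primer:
        # First scheme: wild type nucleotide carries the primer's label,
        # any other nucleotide carries a distinguishable label.
        if nucleotide_color is None:
            return "no call"  # nothing to cross-correlate (assumed outcome)
        return "wild type" if nucleotide_color == primer_color else "variant"
    # Second scheme: wild type nucleotide carries a different label; other
    # nucleotides carry the primer's label or no label, so only one color shows.
    if nucleotide_color is None or nucleotide_color == primer_color:
        return "variant"
    return "wild type"
```

For instance, `call_genotype("green", "red")` corresponds to the green-and-red case of the first scheme, and `call_genotype("green", "red", wild_type_matches_primer=False)` to the second scheme.
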
[0304] Figure 11, for example, shows correlation between location of labeled primer and location 
of incorporation of labeled nucleotides. In this embodiment, polynucleotide molecules were 
hybridized to a fluorescently labeled primer and allowed to randomly attach to a surface via 
streptavidin and biotin with a surface density low enough to resolve single molecules. The primed 
molecules were detected through their fluorescent tags, and their locations recorded. The identified 
locations were then monitored for the appearance of fluorescence in subsequent steps. That is, the 
surface was imaged after allowing incorporation of a single fluorescently-labeled nucleotide. The 
positions of fluorescence that appeared were compared with the positions detected beforehand. 
Figure 11 also shows a correlogram summarizing the pair-wise relationships of the positions of 
detected molecules in the two fields of view, and will be detailed further in the Examples below. 
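
The comparison of recorded primer positions with later nucleotide positions can be sketched as a simple matching step. The Python example below uses greedy nearest-neighbor pairing within a distance tolerance. This is a stand-in chosen for brevity, not the correlogram analysis of Figure 11, and the function name, coordinate units and tolerance are hypothetical.

```python
import math


def correlate_positions(primer_spots, nucleotide_spots, max_distance):
    """Pair each primer spot with the nearest unused nucleotide spot.

    Spots are (x, y) tuples in the same coordinate system. Returns a list of
    (primer_index, nucleotide_index, distance) for pairs within max_distance.
    """
    matches = []
    used = set()
    for i, (px, py) in enumerate(primer_spots):
        best_j, best_d = None, max_distance
        for j, (nx, ny) in enumerate(nucleotide_spots):
            if j in used:
                continue
            d = math.hypot(nx - px, ny - py)
            if d <= best_d:
                best_j, best_d = j, d
        if best_j is not None:
            used.add(best_j)
            matches.append((i, best_j, best_d))
    return matches
```

Primer spots that find no partner within the tolerance are left unmatched, which in this sketch corresponds to attached molecules at which no incorporation was detected.
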
F. High Density Single Base Extension 

[0305] Another aspect of the present invention involves analyzing a plurality of polynucleotide 
molecules bound to a surface of a substrate at high density. In some embodiments, the polynucleotide 
molecules are randomly localized on the surface. Some embodiments involve allowing 
polynucleotide molecules to become coupled to the substrate at a certain density, allowing a 
nucleotide bearing a labeling moiety to become incorporated into its complementary strand, and 
detecting the incorporation. Various surface chemistries may be used to facilitate forming a dense, 
random array of primer-polynucleotide complexes, and various detection methods may be used to 
achieve single molecule resolution of the randomly-bound molecules, as described herein. 



[0306] In some embodiments, the array features primer-polynucleotide complexes at a density of at 
least about 1,000 per cm² at random positions. In some embodiments, the density of complexes on the 
array can be at least about 2,000 per cm², at least about 3,000 per cm², at least about 4,000 per cm², at 
least about 5,000 per cm², at least about 6,000 per cm², at least about 7,000 per cm², at least about 
8,000 per cm², at least about 9,000 per cm², at least about 10,000 per cm², at least about 20,000 per 
cm², at least about 30,000 per cm², at least about 40,000 per cm², at least about 50,000 per cm², at 
least about 60,000 per cm², at least about 70,000 per cm², at least about 80,000 per cm², at least about 
90,000 per cm², at least about 100,000 per cm², at least about 200,000 per cm², at least about 300,000 
per cm², at least about 400,000 per cm², at least about 500,000 per cm², at least about 600,000 per 
cm², at least about 700,000 per cm², at least about 800,000 per cm², at least about 900,000 per cm², at 
least about 1 million per cm², at least about 1.5 million per cm², at least about 2 million per cm², at 
least about 2.5 million per cm², at least about 3 million per cm², and at least about 3.5 million per cm². 
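
To relate such densities to single molecule resolution, the following Python sketch converts a density per cm² into an average spacing and estimates how many complexes would have a neighbor closer than a given optical resolution. It assumes that attachment positions are uniformly random (a two-dimensional Poisson pattern). The function names and the 0.5 micrometre resolution used in the usage note are illustrative assumptions, not values taken from the disclosure.

```python
import math

UM2_PER_CM2 = 1e8  # (10,000 micrometres per cm) squared


def mean_spacing_um(density_per_cm2):
    """Side length, in micrometres, of the square area available per complex."""
    return math.sqrt(UM2_PER_CM2 / density_per_cm2)


def fraction_crowded(density_per_cm2, resolution_um):
    """Fraction of complexes with a neighbor closer than resolution_um.

    Uses the nearest-neighbor distribution of a uniform random pattern:
    P(nearest neighbor < r) = 1 - exp(-density * pi * r**2).
    """
    density_per_um2 = density_per_cm2 / UM2_PER_CM2
    return 1.0 - math.exp(-density_per_um2 * math.pi * resolution_um ** 2)
```

At 1 million complexes per cm², `mean_spacing_um` returns 10 micrometres, and `fraction_crowded(1e6, 0.5)` indicates that under these assumptions fewer than 1% of complexes would lie within 0.5 micrometres of a neighbor.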

G. Sequencing a Given Number of Bases on a Support 

[0307] In some embodiments, the analysis achieves incorporation of at least a given number of 
bases on a support. Such embodiments involve permitting localization of a target polynucleotide on a 
surface of a substrate, providing up to four types of labeled nucleotides, where each of the types 
comprises a labeling moiety, allowing incorporation of a given number of the nucleotides into the 
complementary strand in the presence of a polymerizing agent, and detecting the incorporation after 
incorporation of one or more of the nucleotides. As in other embodiments, the nucleotides may be 
provided sequentially or simultaneously, and the target may be analyzed in bulk or as a single copy. 

H. De Novo Sequencing 

[0308] In some embodiments, the analysis is used to analyze the sequence of a substantially 
unknown sequence, i.e., in de novo sequencing. Any of the aspects, embodiments and/or variations of 
the present invention may be used. Certain embodiments can facilitate de novo sequencing of about 
5 bases, about 6 bases, about 7 bases, about 8 bases, about 9 bases, about 10 bases, about 20 bases, 
about 50 bases, about 100 bases, about 150 bases, about 200 bases, about 250 bases, about 300 bases, 
about 350 bases, about 400 bases, about 450 bases, about 500 bases, about 550 bases, about 600 
bases, about 650 bases, about 700 bases, about 750 bases, about 800 bases, about 850 bases, about 
900 bases, about 950 bases, about 1000 bases, about 1100 bases, about 1200 bases, about 1300 bases, 
about 1400 bases, about 1500 bases, about 1600 bases, about 1700 bases, about 1800 bases, about 
1900 bases, about 2000 bases, about 2500 bases, about 3000 bases, about 3500 bases, about 4000 
bases, about 4500 bases, about 5000 bases, about 5500 bases, about 6000 bases, about 6500 bases, 
about 7000 bases, about 7500 bases, about 8000 bases, about 8500 bases, about 9000 bases, about 
9500 bases, about 10,000 bases, including at least about 10,000 bases. 

I. Re-Sequencing 

[0309] In some embodiments, the analysis is used to analyze the sequence of a substantially known 
sequence, i.e., in re-sequencing. Any of the aspects, embodiments and/or variations of the present 
invention may be used. Certain embodiments can facilitate re-sequencing of about 5 bases, about 6 
bases, about 7 bases, about 8 bases, about 9 bases, about 10 bases, about 20 bases, about 50 bases, 
about 100 bases, about 150 bases, about 200 bases, about 250 bases, about 300 bases, about 350 
bases, about 400 bases, about 450 bases, about 500 bases, about 550 bases, about 600 bases, about 
650 bases, about 700 bases, about 750 bases, about 800 bases, about 850 bases, about 900 bases, 
about 950 bases, about 1000 bases, about 1100 bases, about 1200 bases, about 1300 bases, about 1400 
bases, about 1500 bases, about 1600 bases, about 1700 bases, about 1800 bases, about 1900 bases, 
about 2000 bases, about 2500 bases, about 3000 bases, about 3500 bases, about 4000 bases, about 
4500 bases, about 5000 bases, about 5500 bases, about 6000 bases, about 6500 bases, about 7000 
bases, about 7500 bases, about 8000 bases, about 8500 bases, about 9000 bases, about 9500 bases, 
about 10,000 bases, about 20,000 bases, about 30,000 bases, about 40,000 bases, about 50,000 bases, 
about 60,000 bases, about 70,000 bases, about 80,000 bases, about 90,000 bases, about 100,000 bases, 
about 150,000 bases, about 200,000 bases, about 250,000 bases, about 300,000 bases, about 350,000 
bases, about 400,000 bases, about 450,000 bases, and at least about 500,000 bases. 
[0310] In some embodiments, an immobilized template molecule can be used repeatedly, by 
denaturing the extended molecule, removing the newly-synthesized complementary strand, annealing 
a new primer, and then repeating the experiment with fresh reagents to sequentially analyze the 
sequence of the same target polynucleotide. This approach is very sensitive because only a single 
copy of the template molecule is needed to obtain sequence information. Further, releasing the 
extension product from the polynucleotide template, e.g., by denaturing, and annealing the template 
with a different primer, provides the opportunity to re-read the same template molecule with different 
sets of nucleotides (e.g., different combinations of two types of labeled nucleotides and two types of 
unlabeled nucleotides). 

[0311] In some embodiments, nucleotides lacking any labeling moiety are provided for a period of 
time to allow unlabeled nucleotides to "fill in" regions, for example regions that are already 
known, until the complementary strand extends to reach unknown regions further downstream. At 
this point, nucleotides bearing a labeling moiety can be added and analysis begun or continued. 

VIII. Applications 

[0312] The methods and kits of the present invention find numerous applications, as featured 
below. 

A. Polynucleotide Counting and Identification 

[0313] Another aspect of the present invention involves counting or enumerating a number of 
copies of a target polynucleotide by synthesizing complementary strands. Such embodiments involve 
allowing the target polynucleotide to become coupled to a random position on a substrate, detecting 
incorporation of a sufficient number of nucleotides into the complementary strand to identify the 
target, and counting the synthesized complementary strands for the identified target. The number of 
incorporations needed to identify the target polynucleotide may be at least about 15, at least about 16, 
at least about 17, at least about 18, at least about 19, at least about 20, at least about 21, or at least 
about 22. 
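
The identify-then-count step can be illustrated with a short Python sketch. It assumes that each detected run of incorporations has already been converted into a base string written in the orientation of the target, and that a read identifies a target when it matches exactly one reference sequence. The function name, the reference dictionary and the exact-match rule are hypothetical simplifications.

```python
from collections import Counter


def count_targets(reads, references, min_bases=15):
    """Count reads per target, using only reads of at least min_bases.

    reads: iterable of base strings read out from synthesized strands.
    references: dict mapping target name to its known sequence.
    A read is counted only if it occurs in exactly one reference.
    """
    counts = Counter()
    for read in reads:
        if len(read) < min_bases:
            continue  # too few incorporations to identify the target
        hits = [name for name, seq in references.items() if read in seq]
        if len(hits) == 1:
            counts[hits[0]] += 1
    return counts
```

Reads that are too short, or that match no reference or more than one, are left uncounted in this sketch.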

B. DNA Applications 



[0314] In some embodiments, the target polynucleotide is DNA, for example DNA composing at 
least 50% of a genome of an organism. Some embodiments further comprise identifying and/or 
counting a gene sequence of more than one cell, and correlating sequence information from the 
various cells. Such embodiments find application in medical genetics. Other embodiments compare 
DNA sequences of normal cells to those of non-normal cells to detect genetic variants. Identification 
of such variants finds use in diagnostic and/or prognostic applications. 
i. Genetic Cancer Research 

[0315] In some embodiments, the present invention features a method of doing genetic cancer 
research, where sequence information from a cancer cell is correlated with information from a non- 
cancer cell or with another cancer cell in a different stage of cancer. In certain embodiments, 
sequence information may be obtained, for example, for at least about 10 cells, for at least about 20 
cells, for at least about 50 cells, for at least about 70 cells, and for at least about 100 cells. Cells in 
different stages of cancer, for example, include a colon polyp cell vs. a colon cancer cell vs. a colon 
metastasizing cell from a given patient at various times over the disease course. Cancer cells of other 
types of cancer may also be used, including, for example a bone cancer, a brain tumor, a breast 
cancer, an endocrine system cancer, a gastrointestinal cancer, a gynecological cancer, a head and neck 
cancer, a leukemia, a lung cancer, a lymphoma, a metastases, a myeloma, a pediatric cancer, a penile 
cancer, a prostate cancer, a sarcoma, a skin cancer, a testicular cancer, a thyroid cancer, and a urinary 
tract cancer. 

[0316] In such embodiments, enumeration may determine changes in gene number, indicating, for 
example that a gene appears three times instead of two times (as in a trisomy) or a gene fails to appear 
(such as a homozygous deletion). Other types of allelic loss and changes in diploidy may also 
be determined, including changes related to, for example, a somatic recombination, a translocation, 
and/or a rearrangement, as well as a sporadic mutation. 
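
As a minimal sketch of how enumeration might be turned into a gene number, the Python example below scales the count for a gene of interest by the count for a reference gene assumed to be present in two copies. The function names, the choice of a two-copy reference and the category labels are assumptions for illustration. Real counts are noisy, so any practical use would need an appropriate statistical treatment.

```python
def estimate_copy_number(gene_count, reference_count, reference_copies=2):
    """Estimate gene copies per cell from molecule counts.

    reference_count is the count for a gene assumed to be present at
    reference_copies copies per cell.
    """
    if reference_count <= 0:
        raise ValueError("reference_count must be positive")
    return round(reference_copies * gene_count / reference_count)


def describe_copy_number(copies):
    """Map an estimated copy number to a descriptive label."""
    labels = {
        0: "homozygous deletion",
        1: "loss of one copy",
        2: "two copies",
        3: "three copies, as in a trisomy",
    }
    return labels.get(copies, "other change in gene number")
```

For example, a gene counted 150 times against a two-copy reference counted 100 times gives an estimate of three copies.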



[0317] Such embodiments find use in diagnostic and prognostic applications, also featured in the 
present invention. For example, a homozygous deletion may indicate certain forms of cancer. It will 
be appreciated by those of skill in the art that other diseases, disorders, and/or conditions may also be 
identified based on recognized changes in diploidy. For example, three copies of chromosome 21 
genes can indicate trisomy 21, associated with Down syndrome. 
ii. Detection of Genetic Variants

[0318] Methods of the present invention allow rapid analysis of DNA sequences at the single
molecule level, lending themselves to applications relying on detailed analysis of individual 
sequences. Additional aspects of the present invention include such applications. 
[0319] For example, certain embodiments provide for SNP detection, by identifying incorporation
of a single nucleotide into a complementary strand of a target polynucleotide sequence at the site of a
known SNP. Any of the variations, embodiments, and/or aspects of the present invention may be
used for such SNP detection. Such methods can also be used to identify other variants due to point
mutations, including a substitution, a frameshift mutation, an insertion, a deletion, an inversion, a
missense mutation, a nonsense mutation, a promoter mutation, a splice site mutation, a sporadic
mutation and the like.

[0320] Moreover, the invention also features methods of diagnosing a metabolic condition, a
pathological condition, a cancer and other disease, disorder or condition (including a response to a
drug) by identifying such genetic variants. For example, a known wild type versus a known variant
can be distinguished using two detectably distinguishable labeling moieties. Suppose a G at a
particular position indicates wild type, while a C at that position indicates a variant of interest. By
using G's bearing one detectable labeling moiety, and C's bearing a detectably distinguishable
labeling moiety, whether a target polynucleotide exhibits the wild type or variant sequence can readily
be determined by the methods of the present invention.

[0321] Certain embodiments provide for detection of additional genetic variants, by identifying 
incorporation of more than one nucleotide into a complementary strand of a target polynucleotide 
sequence, either at substantially known regions of variation or at substantially unknown regions. Any 
of the variations, embodiments, and aspects of the present invention may be used for such detection. 
Comparison of sequences from more than one individual allows identification of genetic variants, 
including substitutions, frameshift mutations, insertions, deletions, inversions, missense mutations, 
nonsense mutations, promoter mutations, splice site mutations, sporadic mutations, a duplication, 
variable number tandem repeats, short tandem repeat polymorphisms, and the like. 
[0322] Moreover, the invention also provides methods of diagnosing a metabolic condition, a
pathological condition, a cancer, and/or other disease, disorder or condition (including a response to a
drug) by identifying such genetic variants. For example, in some embodiments, the identified
nucleotide variant indicates adenomatous polyposis coli, adult polycystic kidney disease, α1-antitrypsin
deficiency, cystic fibrosis, Duchenne muscular dystrophy, familial hypercholesterolemia,
fragile X syndrome, hemochromatosis, hemophilia A, hereditary nonpolyposis colorectal cancer,
Huntington disease, Marfan syndrome, myotonic dystrophy, neurofibromatosis type 1, osteogenesis
imperfecta, phenylketonuria, retinoblastoma, sickle cell disease, Tay-Sachs disease, or thalassemia, as
well as cleft lip, club foot, congenital heart defects, neural tube defects, pyloric stenosis, alcoholism,
Alzheimer disease, bipolar affective disorder, cancer, diabetes type I, diabetes type II, heart disease,
stroke, or schizophrenia.
C. RNA Applications 

[0323] In some embodiments, the target polynucleotide is RNA, and/or cDNA copies
corresponding to RNA. In some embodiments, the RNA includes one or more types of RNA,
including, for example, mRNA, tRNA, rRNA, and snRNA. In some embodiments, the RNA 
comprises RNA transcripts. 



[0324] Some embodiments use a primer that hybridizes to the target polynucleotide whose
complementary strand is to be synthesized. In some of those embodiments, the primer used comprises
a polyT region and a region of at least two degenerate nucleotides. This facilitates identification
and/or counting of random mRNA sequences in eukaryotic cells, as the polyT can hybridize to the
polyA region of the mRNA and the degenerate nucleotides can hybridize to corresponding random
sequences. Such primers also avoid sequencing the polyA tail itself.

[0325] In some embodiments, the RNA comprises RNA molecules from a cell, from an organelle,
and/or from a microorganism. The number of RNA molecules may be about 100, about 200, about
300, about 400, about 500, about 600, about 700, about 800, about 900, about 1,000, about 2,000,
about 3,000, about 4,000, about 5,000, about 6,000, about 7,000, about 8,000, about 9,000, about
10,000, up to and including all of the RNA molecules in the cell, organelle, and/or microorganism.
Some embodiments comprise identifying and counting RNA molecules from more than one cell,
organelle, and/or microorganism. A histogram of the copy numbers of various types of RNA
molecules identified can be constructed for different cells, organelles and/or microorganisms, and
used to compile transcriptional patterns of RNA complements for each analyzed cell. The different
cells, organelles, and/or microorganisms may be in different states, e.g. a diseased cell vs. a normal
cell; or at different stages of development, e.g. a totipotent cell vs. a pluripotent cell vs. a
differentiated cell; or subjected to different stimuli, e.g. a bacterial cell vs. a bacterial cell exposed to
an antibiotic. In some embodiments, the methods can detect any statistically significant difference in
copy numbers between cells, organelles, and/or microorganisms.
i. Identifying Unknown RNA molecules 

[0326] Such sequence information finds use in a number of applications featured by the present
invention. For example, an aspect of the present invention involves identifying unknown RNA
molecules. In some embodiments, the methods facilitate detection of RNA molecules in a cell limited
only by Poisson statistics. In such embodiments, for example, determining copy number of RNA
molecules can identify untranslated sequences and/or hitherto unknown RNA molecules that are
ordinarily present in low or very low copy numbers, such as about one, about two, about three, about
four, about five, about 6, about 7, about 8, about 9, about 10, about 11, about 12, about 15, about 20,
and about 25.
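
The Poisson limit mentioned above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each RNA molecule in the cell is captured and read independently with the same probability; the function name and the sampled fraction are hypothetical and are not taken from the specification.

```python
import math

def detection_probability(copy_number, sampled_fraction):
    """Poisson estimate of the chance of observing at least one molecule
    of a transcript present at `copy_number` copies, when each molecule is
    read independently with probability `sampled_fraction` (an assumption)."""
    expected_reads = copy_number * sampled_fraction
    return 1.0 - math.exp(-expected_reads)

# Hypothetical example: transcripts at 1, 5 and 25 copies per cell,
# with half of all molecules in the cell read.
for copies in (1, 5, 25):
    print(copies, round(detection_probability(copies, 0.5), 3))
```

Under these assumptions, the probability of missing a transcript falls off exponentially with its copy number, which is why very low copy transcripts are the limiting case.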

ii. Annotating Genomes

[0327] The invention also features an approach to annotating genomes based on counting and
identifying RNA transcripts. The identified transcripts indicate, for example, how sequenced genes
are actually transcribed and/or expressed. By comparing the analyzed sequence of an identified
transcript to one or more predicted expressed sequences, the prediction can be confirmed, modified, or
refuted, providing a means to annotate genomes.
iii. Tissue Engineering

[0328] Another application featured in the present invention involves methods of tissue
engineering. Such embodiments provide for analyzing a plurality of RNA molecules of a cell at
different stages of differentiation towards a particular tissue type, compiling information about
transcriptional patterns of the RNA molecules (e.g. copy number and identity), and causing a target
cell to feature a similar transcription pattern, thereby engineering a cell-type of the tissue.
[0329] The differentiated state may be that of a heart cell, a pancreatic cell, a muscle cell, a bone
cell, an epidermal cell, a skin cell, a blood cell, a nerve cell, a mammary gland cell, a cell of the
olfactory epithelium, a cell of the auditory epithelium, a cell of the optic epithelium, an endodermal
cell, a lung cell, an alveoli cell, a cell of the respiratory epithelium, an intestinal cell, an absorptive
cell, a goblet cell, a Paneth cell, an enteroendocrine cell, a liver cell, a mesodermal cell, a blood vessel
cell, and an endothelial cell.

iv. Determining Phylogenic Relationships



[0330] Still another feature of the present invention involves methods of determining phylogenic
relationships of various species. Such embodiments provide for compiling transcriptional patterns of
cells from different species and analyzing the relationships amongst homologous transcripts. Such
information finds use in determining evolutionary relationships amongst species.

v. Determining Cellular Responses to Stimuli 

[0331] Another feature of the present invention involves a method of determining a
microorganism's response to various stimuli, for example, response when exposed to a drug or
subjected to other treatment, such as being deprived of certain metabolites. In such embodiments,
transcriptional patterns of a cell of the microorganism, for example a bacterial cell, can be compared
before and after administration of the drug or other treatment.

vi. Identifying Alternative Splice Sites 

[0332] Certain embodiments provide for detection of alternative splice sites, by identifying
incorporation of a nucleotide into a complementary strand of a target polynucleotide sequence, either
at known regions of a splice site or at unknown regions. Any of the variations, embodiments, and/or
aspects of the present invention may be used for such detection. Comparison of sequences from more
than one RNA molecule allows identification of alternative splice sites. In some embodiments, a
primer can be allowed to hybridize to a region on the target RNA molecule within one or more
nucleotides downstream of the region of interest, i.e., the expected splice site. Incorporation of
nucleotides can then be allowed to proceed, extending the primer towards the region of interest, at
least far enough to identify the concatenated exon.

[0333] Moreover, the invention also provides methods of diagnosing cancer and other diseases,
disorders and/or conditions, including, for example, sickle cell anemia, by identifying such alterations
in splicing. For example, in some embodiments, the identified nucleotide variant indicates
adenomatous polyposis coli, adult polycystic kidney disease, α1-antitrypsin deficiency, cystic fibrosis,
Duchenne muscular dystrophy, familial hypercholesterolemia, fragile X syndrome, hemochromatosis,
hemophilia A, hereditary nonpolyposis colorectal cancer, Huntington disease, Marfan syndrome,
myotonic dystrophy, neurofibromatosis type 1, osteogenesis imperfecta, phenylketonuria,
retinoblastoma, sickle cell disease, Tay-Sachs disease, or thalassemia, as well as cleft lip, club foot,
congenital heart defects, neural tube defects, pyloric stenosis, alcoholism, Alzheimer disease, bipolar
affective disorder, cancer, diabetes type I, diabetes type II, heart disease, stroke, or schizophrenia.
[0334] Many modifications and variations of this invention can be made without departing from its
spirit and scope. The specific embodiments described below are for illustration only and are not
intended to limit the invention in any way. All publications, figures, patents and patent applications
cited herein are hereby expressly incorporated by reference for all purposes to the same extent as if
each was so individually denoted.



EXAMPLES 
Example 1

Basic Materials and Methods 

1. Materials and Reaction Reagents 

[0335] (1) Solutions and Buffers

[0336] RCA: H2O:NH4OH:H2O2 (6:4:1), boiling for an hour.

[0337] PEI: Polyethylenimine (Sigma P-3143) (positively charged)

[0338] PALL: Poly(allylamine hydrochloride) (Sigma 283223)

[0339] PACr: Poly(acrylic acid, sodium salt) (Sigma 416045) (negatively charged)

[0340] EDC: 9.6 mg/ml; 50 mM (x10) 1-{3-(Dimethylamino)pro

[0341] hydrochloride), Activator for the BLCPA (Sigma-161462)

[0342] BLCPA: EZ-Link Biotin LC-PEO-Amine (Pierce 21347) Stock solution 50 mM in MES 10
mM (21 mg/ml) (x10)

[0343] Streptavidin plus: 1 mg/ml in Tris. PROzyme, Code: SA20 (x10)
[0344] Buffers:

[0345] MES (N-morpholinoethanesulfonic acid) pH 5.5, 1 M (100x)
[0346] TRIS 10 mM

[0347] TRIS-MgCl2: 10 mM Tris, 100 mM MgCl2 (x1)

[0348] TKMC (10 mM Tris·HCl, 10 mM KCl, 10 mM MgCl2, 5 mM CaCl2, pH 7.0)

[0349] EcoPol: 10 mM Tris·HCl, 5 mM MgCl2, 7.5 mM DTT, pH @ 25° C; buffer comes with the
polymerase at (x10)
(2) Other Materials and Reagents

[0350] Nucleotides: dTTP, dGTP, dATP, and dCTP-Cy3 at 10 uM concentration

[0351] Polymerases: a) Klenow Polymerase I (5 units/ul), New England BioLabs Cat 210S

[0352] b) Klenow-exo, New England BioLabs Cat 212S

[0353] c) TAQ
[0354] d) Sequenase

[0355] Hybridization Chamber: Sigma H-1409 

[0356] Polynucleotide templates and primers: 

[0357] 7G: Biotin-5'-tcagtcatca gtcatcagtc atcagtcatc agtcatcagt catcagtcat cagtcatcag tcatcagtca
tcagtcatca gtcatcACAC GGAGGTTCTA-3' (SEQ ID NO:1)

[0358] Primer p7G: 5'-TAGAACCTCCGTGT-3' (SEQ ID NO:2); the primer can be labeled with
Cy5 or Cy3.
[0359] Mu50: Biotin-5'-tccagcgtgttttatctctgcgagcataatgtxrtgcgtcatwgcc^ 3' (SEQ ID NO:3)
[0360] Cy5 labeled primer (PMu50Cy5): Cy5 5'-gctggcggatgac-3' (SEQ ID NO:4)
[0361] 7.7A-Biotin-5'-
tttGcttettAttctttGc^
GGTTCTA-3' (SEQ ID NO:5)

[0362] 6A6CG: Biotin-5'^cAttttu^^
CACGGAGGTTCTA-3' (SEQ ID NO:6)
2. Substrate Treatment and Template Attachment 

[0363] A fused silica microscope slide (1 mm thick, 25x75 mm size, Esco Cat R130110) was used
to attach DNA templates. The slide was first cleaned with the RCA method as described above and
in WO 01/32930. A multilayer of polyallylamine/polyacrylic acid was adsorbed to the slide. An EZ link
connector was then attached to the slide as follows: the slide was dried, scratched with a diamond
pencil, and then covered with a hybridization chamber. 120 ul of a mixture of 1:1:8 EDC:BLCPA:MES
(50 mM EDC, 50 mM BLCPA, 10 mM MES) was applied to each slide. Following incubation
for 20 minutes, 120 ul of Streptavidin Plus diluted to 0.1 mg/ml was added to the slide. After 20 min
of incubation, the slide was washed with 200 ul of Tris 10 mM.
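
The dilution arithmetic in this step can be sketched as follows. The sketch assumes that the 50 mM figures are the stock concentrations listed under Solutions and Buffers and that the 1:1:8 ratio is by volume; the function names are hypothetical and are used only for illustration.

```python
def mix_concentrations(stocks_mM, parts):
    """Final concentration (mM) of each stock after combining by volume parts."""
    total_parts = sum(parts.values())
    return {name: conc * parts[name] / total_parts
            for name, conc in stocks_mM.items()}

def stock_volume_ul(final_conc, stock_conc, final_volume_ul):
    """Volume of stock needed to reach final_conc in final_volume_ul (C1*V1 = C2*V2)."""
    return final_conc * final_volume_ul / stock_conc

# Assumed stock concentrations (mM) and the 1:1:8 EDC:BLCPA:MES volume ratio.
stocks = {"EDC": 50.0, "BLCPA": 50.0}
parts = {"EDC": 1, "BLCPA": 1, "MES": 8}
final = mix_concentrations(stocks, parts)
print(final)  # working concentrations of EDC and BLCPA in the applied mixture

# Streptavidin Plus: 1 mg/ml stock diluted to 0.1 mg/ml for a 120 ul application.
strep_ul = stock_volume_ul(0.1, 1.0, 120.0)
print(strep_ul)
```

Under these assumptions, each of EDC and BLCPA is diluted tenfold in the applied mixture, which agrees with the (x10) annotation on the stock solutions.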

[0364] Preparation of 10 pM Oligo: the 7G oligonucleotide template (SEQ ID NO:1) was
pre-hybridized with Cy5-labeled primer (SEQ ID NO:2) (in stock at 7 uM) in TRIS-MgCl2 buffer. The
treated slide was examined for contamination with the TIR microscope. 200 ul of the
oligonucleotide/primer mixture was applied to each slide. Following incubation for 10 min, the slide
was washed with 200 ul of Tris 10 mM.

[0365] Addition of nucleotides and polymerase: nucleotides dTTP, dATP, dGTP, and Cy3-dCTP,
each at 20-100 nM, were mixed in the ECOPOL buffer. 1 ul of Klenow 210S from stock solution (kept
at -20° C.) was added to 200 microliters of the nucleotide mixture. 120 ul of the mixture was then
added on each slide. After incubation for 0 to 30 min (for different experiments), the slide was
examined with the TIR microscope. Unless otherwise noted, all reactions were performed at room
temperature, while the reaction reagents were kept at 4° C. or -20° C. The primer/oligonucleotide
hybridization reaction was carried out with a thermocycler machine.
[0366] Single molecule resolution was achieved by using a very low concentration of the
polynucleotide template, which ensured that only one template molecule is attached to a distinct spot
on the slide. Single molecule attachment to a distinct spot is also confirmed by the observation of a single
bleaching pattern of the attached fluorophores. In the reaction described above, a concentration of
about 10 pM of an 80-mer oligonucleotide template was used for immobilizing to the slide. The space
between different DNA molecules attached to the surface of the slide was measured at a few micrometers.
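
A rough sense of why a spacing of a few micrometers gives single molecule resolution can be had from a Poisson occupancy estimate. The sketch below assumes templates land at random positions, takes the mean spacing as 3 micrometers and the optically resolvable spot radius as 0.25 micrometers; both numbers and the function name are illustrative assumptions rather than values from the experiment.

```python
import math

def multi_occupancy_fraction(spacing_um, spot_radius_um):
    """Among resolvable spots holding at least one template, the fraction
    expected to hold two or more, assuming random (Poisson) placement."""
    density = 1.0 / spacing_um ** 2                    # templates per square micrometer
    mean = density * math.pi * spot_radius_um ** 2     # expected templates per spot
    p_at_least_one = -math.expm1(-mean)                # 1 - exp(-mean)
    p_at_least_two = p_at_least_one - mean * math.exp(-mean)
    return p_at_least_two / p_at_least_one

fraction = multi_occupancy_fraction(spacing_um=3.0, spot_radius_um=0.25)
print(f"fraction of occupied spots with more than one template: {fraction:.4f}")
```

Under these assumptions only about one occupied spot in a hundred would contain more than one template, consistent with the single step bleaching noted above.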

Imaging with Single Molecule Resolution
[0367] As illustrated in Figure 1a, incorporation of a single nucleotide molecule into the
complementary strand of a single target polynucleotide molecule can be detected and imaged
according to the present invention. Figure 1a illustrates two different target polynucleotides analyzed
in parallel on the surface of a substrate. Incorporation of, for example, a labeled adenine nucleotide
(A*) into a complementary strand of one of the target polynucleotides is visualized on the surface, as
indicated by the spot shown in the top view. Later, incorporation of, for example, a labeled thymine
nucleotide (T*) into the complementary strand of a different target polynucleotide can be seen as a
spot on a different position in the field of view, corresponding to a different location on the surface of
the substrate. If nucleotides incorporate into both strands, for example two A*'s, two spots at
corresponding positions can be detected, indicating incorporation into the complementary strands of
the two individual target polynucleotides.

[0368] As illustrated in Figure 1b, the single stranded oligonucleotide template (SEQ ID NO:1)
primed with a Cy5 labeled primer sequence (SEQ ID NO:2) was immobilized at a single molecule
resolution to the surface of a silica slide using a biotin-streptavidin bond. The surface is coated with
polymers on which biotin (EZ link) is tethered. The oligonucleotide template, with a biotin molecule
attached to one of its ends, was able to attach to the streptavidin-linked surface. The slide surface was
negatively charged, which helps to repel unbound nucleotides. The DNA is specifically attached to the
surface by its 5' side, meaning that the primer, which the polymerase extends, is away from the
surface.

[0369] The template and incorporation of labeled nucleotides were visualized by fluorescence
imaging. Location of the oligonucleotide was monitored by fluorescence from the Cy5 labeled primer
(SEQ ID NO:2). Incorporation of nucleotides was detected because the nucleotides were labeled with
Cy3. After incorporation, the incorporated labels were illuminated. Illumination of Cy3 was at a
wavelength of 532 nm. Following a typical time of a few seconds of continued illumination, the
signals were bleached, typically in a single step.

[0370] As shown in Figure 2, imaging of fluorescent signals with single molecule resolution was
enabled with surface illumination by total internal reflection (TIR). Ishijima et al. (Cell 92:161-71,
1998) showed that it is possible to observe the fluorescence of single molecules immobilized to a
surface in a wet environment even when there are free molecules in the solution. Here, the TIR was
facilitated by a dove prism coupling of the laser beam to the silica slide surface. An upright
microscope with an immersion oil objective was used to image the surface with an intensified CCD
(PentaMax). A filter set (Chroma) was used to reject the illumination frequency and allow the
fluorescence frequency to reach the ICCD.
Example 2 

Test for Specific Attachment of Template Molecules to Substrate Surface

[0371] This experiment was performed to determine whether the polynucleotide templates are
attached to the surface as desired. Figure 3 shows that streptavidin is required for binding the
template to the surface and hence detection of incorporated fluorescence signal. The left panel shows
that there is no fluorescence signal when only the streptavidin-attached surface but no fluorescent labels
were present. The middle panel shows that there are no incorporated fluorescent signals when no
streptavidin was present on the surface to attach the biotin-labeled oligonucleotide template, even though
Cy5-labeled primer was present. The right panel shows detection of incorporated fluorescent
signal when the streptavidin-attached surface, labeled primers, and biotin-labeled oligonucleotide
template were present.
Example 3 

Determining Processivity of DNA Polymerase in the Presence of Labeled Nucleotides
[0372] To determine whether the DNA polymerase accurately incorporates labeled nucleotides into
the template, a bulk extension experiment was performed in a test tube rather than on the surface of a
substrate. As shown in Figure 5, the results indicate that the polymerase incorporates all the labeled
nucleotides into the correct positions. In this experiment, incorporation of dCTP-Cy3 and a
polymerization terminator, ddCTP, were detected using a 7G DNA template (a DNA strand having a
G residue every 7 bases; SEQ ID NO:1). The annealed primer was extended in the presence of non-labeled
dATP, dGTP, dTTP, Cy3-labeled dCTP, and ddCTP. The ratio of Cy3-dCTP to ddCTP was
3:1. The reaction products were separated on a gel, fluorescence excited, and the signals detected,
using an automatic sequencer ABI-377. The results reveal that incorporation of Cy3-dCTP did not
interfere with further extension of the primer along the 7G oligomer template.
[0373] Figure 5 shows fluorescence intensity from primer extension products of various lengths
which were terminated by incorporation of ddCTP at the different G residues in the 7G oligomer
template (SEQ ID NO:1). The first band is the end of the gel and should not be counted, as it is in the
very beginning of the gel. The full length of the template is 100 residues. The first band (marked "1"
in the graph) corresponds to extension products which were terminated by incorporation of non-labeled
ddCTP at the second G residue (position 27) and have incorporated Cy3-dCTP at the first G
residue (position 20). Similarly, the tenth band (marked "10" in the graph) represents extension
products which were terminated by incorporation of non-labeled ddCTP at the 10th G residue
(position 90) and have incorporated Cy3-dCTP at the previous G residues (i.e., positions 20, 27, 34, 41,
48, 55, 62, 69, 76, and 83). The results showed good agreement between the expected positions for
Cy3 incorporation in the polynucleotide template and the positions of the fluorescence intensity
bands.
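
The band positions quoted above follow a simple arithmetic pattern, sketched below. The positions (first G at 20, one G every 7 bases) come from the text; the band abundance estimate additionally assumes that the polymerase picks Cy3-dCTP and ddCTP in proportion to their 3:1 ratio, which is an illustrative assumption and not a measured property. Function names are hypothetical.

```python
FIRST_G = 20     # position of the first G residue, as stated in the text
G_SPACING = 7    # the 7G template has a G every 7 bases

def band_positions(band):
    """Return (termination position, labeled positions) for band number `band`."""
    terminated_at = FIRST_G + G_SPACING * band
    labeled = list(range(FIRST_G, terminated_at, G_SPACING))
    return terminated_at, labeled

def band_fraction(band, p_label=0.75):
    """Expected share of products in a band if each G takes Cy3-dCTP with
    probability p_label and ddCTP otherwise (assumed from the 3:1 ratio)."""
    return p_label ** band * (1.0 - p_label)

for band in (1, 10):
    stop, labeled = band_positions(band)
    print(band, stop, labeled, round(band_fraction(band), 4))
```

Band 1 stops at position 27 with a single label at position 20, and band 10 stops at position 90 with labels at positions 20 through 83, matching the description of Figure 5.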
Example 4 

Detection of Single Nucleotide Incorporation by TIR

[0374] Total internal reflection (TIR) fluorescence microscopy allows detection of real-time
incorporation of labeled nucleotide into a single immobilized polynucleotide template. This
illumination method reduces the background from the sample by illuminating only a thin layer (e.g., in
the order of 150 nm) near the surface. Even in the presence of free dyes in the solution (up to 50 nM),
single molecules can be observed. Using TIR, we visualized single molecules of labeled nucleotide
bound to DNA in the presence of up to 50 nM free dye in solution. Though this concentration is low
compared to the concentration needed for a high rate of incorporation of nucleotides by the DNA
polymerase, it was sufficient for its operation.
Optical Setup

[0375] The laser source is shown in Figure 2. The light sources (e.g., lasers) are coupled to the
surface by a prism. The surface is imaged by a regular 1.3 NA microscope objective onto an intensified
CCD (Pentamax). A fluorescence filter in the optical path blocks the laser light and allows the
fluorescent signals from the dye molecules to pass through (Chroma filters). Optionally, the camera and
the shutters for the lasers are controlled by the computer.
Illumination

[0376] As shown in Figure 6, TIR illumination of a polynucleotide-attached slide produced a low
background and allowed detection of signals only from immobilized labels. The refraction index of
the fused silica glass and the oil beneath the surface is about 1.46. The refraction index of the liquid
above the glass is about 1.33 to 1.35. At the interface of the glass and the water the illumination ray
is refracted. If the illumination is very shallow, 70-75 degrees from the surface normal, the
light is reflected back and does not continue into the liquid phase, as the critical angle for total
internal reflection is about 65-67 degrees ($\theta_{critical} = \sin^{-1}(n_1/n_2)$, where $n_1$ is the refraction index of the liquid and $n_2$ that of the glass).
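
The critical angle range quoted above can be checked directly from the stated refraction indices. This is a minimal sketch; the function name is hypothetical.

```python
import math

def critical_angle_deg(n_liquid, n_glass):
    """Critical angle for total internal reflection, measured from the
    surface normal, for light travelling from glass into the liquid."""
    return math.degrees(math.asin(n_liquid / n_glass))

# Refraction indices taken from the text: glass about 1.46, liquid 1.33 to 1.35.
low = critical_angle_deg(1.33, 1.46)
high = critical_angle_deg(1.35, 1.46)
print(f"critical angle between {low:.1f} and {high:.1f} degrees")
```

Illumination at 70-75 degrees from the normal therefore exceeds the critical angle over the whole stated range of liquid refraction indices.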

[0377] The illumination process, called evanescent illumination, leaves a decaying field near the
interface which illuminates only about 150 nm into the liquid phase. Fluorophore dyes can be excited
by this field, so only the dyes which are near the surface will emit. Furthermore, free labeled
nucleotide molecules in the solution will move around due to Brownian motion. The fast movement
of these free molecules produces only a smeared signal because the integration time is in the order of
a hundred milliseconds. Thus, the total internal reflection illumination leads to a low background from
the free molecules, and only signals from the immobilized dyes are detected.
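
The depth of the evanescent field can be estimated with the standard textbook expression for the 1/e intensity decay length, d = λ / (4π · sqrt(n_glass² sin²θ − n_liquid²)). The sketch below applies it to the 532 nm illumination and the refraction indices given in the text; the formula is a general optics result and is not stated in the specification, and the function name is hypothetical.

```python
import math

def penetration_depth_nm(wavelength_nm, n_glass, n_liquid, incidence_deg):
    """1/e intensity decay length of the evanescent field beyond the interface."""
    s = n_glass * math.sin(math.radians(incidence_deg))
    if s <= n_liquid:
        raise ValueError("incidence angle is below the critical angle")
    return wavelength_nm / (4.0 * math.pi * math.sqrt(s * s - n_liquid ** 2))

depth_70 = penetration_depth_nm(532.0, 1.46, 1.33, 70.0)
depth_75 = penetration_depth_nm(532.0, 1.46, 1.33, 75.0)
print(f"70 degrees: {depth_70:.0f} nm, 75 degrees: {depth_75:.0f} nm")
```

For incidence angles of 70-75 degrees this gives depths on the order of 100 nm, the same order as the roughly 150 nm illuminated layer described above.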
Detection of Single Molecules 

[0378] Figure 6 shows detection of signals from a single Cy3 molecule with no free dye in solution
versus signals from a single Cy3 molecule with a background of 15 nM Cy3 in solution. A fluorescence
image from incorporation of Cy3 labeled nucleotide is shown in the upper panels. The signals tend to
bleach in a single step (see the upper graph). When there are free labeled nucleotides in the solution (15
nM free dye), the background signal is stronger (lower right panel) than the background signal in the
absence of free labeled nucleotides in the solution. But the signal from the incorporated single
molecule can still be detected. The ability to detect a single molecule in the presence of free dye enables
one to follow incorporation of nucleotide into an immobilized DNA template in real time.

[0379] The upper left panel of Figure 6 shows typical images of single molecules (see the bright
spots). When the intensity of a spot is traced in real time (upper right panel), one can see that it
appears (incorporation event or sticking to the surface event) and disappears (bleaching or detaching
event). The same results are also illustrated in the middle long thin panel of Figure 6. This panel
shows successive images of a small area around the spot that was being traced. The fluorescent signal
appeared and disappeared after every few seconds (every frame is a one second exposure).
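
The appear and disappear behavior of a traced spot can be turned into discrete events with a simple threshold. The sketch below is one possible way to do this, not the analysis used in the experiment; the intensity values, the threshold, and the function name are invented for illustration.

```python
def find_events(trace, threshold):
    """Return (start_frame, end_frame) pairs for each run of frames whose
    intensity is above threshold; end_frame is the first frame back below."""
    events = []
    start = None
    for frame, value in enumerate(trace):
        if value > threshold and start is None:
            start = frame                      # signal appears
        elif value <= threshold and start is not None:
            events.append((start, frame))      # signal disappears in one step
            start = None
    if start is not None:
        events.append((start, len(trace)))     # still on at the end of the trace
    return events

# Hypothetical intensity trace, one value per one second frame.
trace = [2, 3, 2, 40, 42, 41, 3, 2, 38, 39, 2]
events = find_events(trace, threshold=20)
print(events)
```

Each pair marks one appearance (incorporation or sticking) and the following disappearance (bleaching or detaching), so the event count and durations can be read off directly.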
Example 5 

Determining Nucleotide Incorporation Based on Correlation of Fluorescence Spots
[0380] A correlation was observed between the position of the immobilized DNA template on the
surface (indicated by the fluorescently labeled primer) and the incorporation of nucleotide to the
surface. In Figure 4, an image of the immobilized DNA which was hybridized to the Cy5 labeled primer
is shown in the upper two panels (the middle panel is a magnified image of a small area in the left
panel). The small dots in the image represent likely positions of the DNA templates immobilized on
the surface. The fluorescence signals were then bleached out by a long irradiation (about 1 minute) at
635 nm with a 10 mW laser diode. Subsequently, the polymerase and the nucleotides (including the
Cy3-labeled dCTP) were added, and the mixture incubated at room temperature for about an hour.
After washing, a second image of the surface was taken. This time a new set of fluorescence-labeled
points appeared (see lower left two panels). The results indicate that the two sets of fluorescently-labeled
points are correlated (see right panel). It is noted that the significant overlap (about 40%)
between DNA primer location (Cy5) and dCTP incorporation location (Cy3) cannot be a random
result. Under the concentrations of labeled DNA primers used in the experiment, the probability for
this correlation to occur randomly was calculated to be about 10^-50. Rather, the correlation is due to
incorporation of the Cy3 labeled nucleotides into the immobilized, Cy5 labeled primer.
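
One way to see how such a small random probability can arise is a binomial tail estimate. The sketch below assumes primer spots are placed at random and treats each Cy3 spot as an independent trial; the spot density, matching radius, and spot counts are hypothetical values chosen for illustration and are not the numbers used in the experiment, and the function names are likewise hypothetical.

```python
import math

def random_overlap_probability(density_per_um2, match_radius_um):
    """Chance that a randomly placed spot falls within match_radius of at
    least one primer spot, for randomly scattered primers (an assumption)."""
    return -math.expm1(-density_per_um2 * math.pi * match_radius_um ** 2)

def log10_binomial_tail(n, k, p):
    """log10 of P(X >= k) for X ~ Binomial(n, p), summed in log space."""
    logs = [
        math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
        + i * math.log(p) + (n - i) * math.log1p(-p)
        for i in range(k, n + 1)
    ]
    peak = max(logs)
    total = peak + math.log(sum(math.exp(v - peak) for v in logs))
    return total / math.log(10.0)

# Hypothetical numbers: 0.1 primers per square micrometer, 0.2 micrometer
# matching radius, 200 Cy3 spots of which 80 (40%) coincide with a primer.
p_random = random_overlap_probability(0.1, 0.2)
log10_p = log10_binomial_tail(n=200, k=80, p=p_random)
print(f"per-spot random overlap: {p_random:.4f}; log10 P(>= 80 of 200): {log10_p:.0f}")
```

With a per-spot random overlap chance near one percent, a 40% overlap is far outside what chance placement would produce, which is the qualitative point made above.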
[0381] Incorporation of labeled nucleotide into the immobilized template is also demonstrated by
the multi-incorporation data shown in Figure 7. When the intensities of the spots in Figure 4 were
measured, multistep bleaching was observed (Figure 7, upper left panel). A simulation of the multiple
bleaching is shown in the upper right panel. The results are what should be expected if a few molecules
are located in the same place up to the optical resolution. This indicates that the polymerase can
incorporate a few labeled nucleotides into the same DNA template. In a control experiment, ddATP,
dCTP-Cy3 and dGTP were used to extend Cy5-labeled primer PMu50Cy5, Cy5 5'-gctggcggatgac-3'
(SEQ ID NO:4) along the Mu50 oligonucleotide template (SEQ ID NO:3). This allows only one Cy3-labeled
nucleotide to be incorporated into the primer because the first codon in the template sequence
after the primer is CGT. Incorporation of ddATP immediately after the incorporation of dCTP-Cy3
terminates the elongation. As shown in the lower right panel, there is no multibleaching.
[0382] It is noted that because the concentration of the DNA template on the surface was so low, it
is unlikely that more than one copy of the DNA template was present on each spot. Further, multiple
bleaching is not common when the polymerase is not present (data not shown). In particular, there
is no correlation between primer location and fluorescence signal from the surface when the
polymerase is not present (see, e.g., Figure 13, middle panel).
Example 6
Dynamics of Nucleotide Incorporation

[0383] Figure 8 shows a time course of incorporation events during the DNA polymerase reaction.
In this experiment, the DNA template and Cy5-labeled primer complex was immobilized to the
substrate surface as described above, and its position was imaged. The DNA polymerase was then
added along with the nucleotides, of which one was labeled with Cy3.

[0384] As indicated in the figure, the substrate was imaged every 10 sec, with a 1 sec exposure.
Every spot with immobilized DNA template (as indicated by the labeled primer) was monitored as a
function of time. A series of small images of these spots was placed along a strip, resulting in a movie
showing the "activities" at each point.

[0385] Repeated incorporation of nucleotide into the DNA template is shown in Figure 9. Using
more dyes will enable us to read the sequence of the DNA directly in an asynchronous manner. Figure
9 shows the dynamic incorporation events at 8 different spots. The digital information recorded in
these movies indicates that repeated incorporation events occurred at various time points. The data also
demonstrated the feasibility of monitoring primer extension activities on single DNA molecules.

[0386] Figure 10 shows a histogram of the number of incorporation events on single spots and a
histogram of the time between incorporation events. From the histograms one can see that a few
nucleotides were incorporated into single DNA molecules. The low number of events in which more
than three nucleotides were incorporated indicates that there is some mechanism that prevents a high
number of incorporations into the DNA under the experimental conditions. The reason could be that
photo-damage to the DNA in the surrounding area of the illuminated dye might produce toxic
radicals. Changing the reaction conditions and reagents could increase the numbers of incorporated
nucleotides dramatically.
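
The two histograms described for Figure 10 can be built from per-spot lists of event times. The sketch below shows one straightforward way to tabulate them; the spot names and event times are invented for illustration, and the function name is hypothetical.

```python
from collections import Counter

def incorporation_histograms(events_by_spot):
    """Return (events-per-spot histogram, inter-event-time histogram)."""
    events_per_spot = Counter(len(times) for times in events_by_spot.values())
    intervals = Counter()
    for times in events_by_spot.values():
        ordered = sorted(times)
        for earlier, later in zip(ordered, ordered[1:]):
            intervals[later - earlier] += 1
    return events_per_spot, intervals

# Hypothetical event times in seconds (frames were taken every 10 sec).
events_by_spot = {
    "spot1": [10, 40, 90],
    "spot2": [20],
    "spot3": [30, 50],
}
per_spot, gaps = incorporation_histograms(events_by_spot)
print(dict(per_spot))  # how many spots showed 1, 2, 3 ... events
print(dict(gaps))      # how often each waiting time between events occurred
```

A drop in the first histogram beyond three events per spot would correspond to the limit on repeated incorporation discussed above.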
Example 7 

Base-by-base Sequence Analysis

[0387] This experiment was performed to confirm selectivity of the polymerase and to illustrate
feasibility of determining the sequence of a polynucleotide template with a base-by-base scheme.
[0388] First, fidelity of the polymerase in incorporation was confirmed by analyzing the correlation
between the location of immobilized primer and the location of nucleotide incorporation with a correlation
graph. Figure 11 shows the correlation between primer location and polymerase activity location. The
position of each point was determined with sub-pixel resolution. Images for the primer location and
the incorporation position were taken first. If there is a correlation between the two, there is a peak in
the correlation graph. Otherwise, no peak is observed. As shown in the figure, the two images
correlate with each other.
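The text does not specify how the correlation graph was computed. One simple way to obtain a peak of this kind, offered only as a sketch under that assumption, is to histogram the offsets between every primer position and every incorporation position: co-localized spots pile up in the bin at zero offset, while unrelated spots spread over many bins. The function names, bin size, and search window below are hypothetical.

```python
from collections import Counter


def displacement_histogram(primers, signals, bin_size=1.0, max_shift=5.0):
    """Count (dx, dy) offsets between each primer and each signal position.

    Positions are (x, y) tuples in pixels with sub-pixel precision.
    Only offsets within +/- max_shift pixels are counted.
    """
    counts = Counter()
    for px, py in primers:
        for sx, sy in signals:
            dx, dy = sx - px, sy - py
            if abs(dx) <= max_shift and abs(dy) <= max_shift:
                counts[(round(dx / bin_size), round(dy / bin_size))] += 1
    return counts


def peak_bin(counts):
    """Return ((dx_bin, dy_bin), count) for the fullest bin, or None if empty."""
    if not counts:
        return None
    return max(counts.items(), key=lambda item: item[1])


# Hypothetical positions: three incorporation signals sit on primers, one does not.
primers = [(10.2, 5.1), (40.7, 33.3), (70.0, 12.5)]
signals = [(10.4, 5.0), (40.5, 33.4), (69.8, 12.6), (55.0, 90.0)]
peak = peak_bin(displacement_histogram(primers, signals))
```

A peak at bin (0, 0) whose count is close to the number of primers indicates that incorporation signals sit on primer locations; a flat histogram corresponds to the "no peak" case described above.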

[0389] Results demonstrating base-by-base analysis of the sequence of an immobilized template at
single molecule resolution are shown in Figure 12. The data indicated that at least two bases of the
template were determined by flowing reagents in and out along with different types of labeled
nucleotides (e.g., dCTP-Cy3, dUTP-Cy3, etc.). Here, a 6TA6GC oligonucleotide template (SEQ ID
NO:6) was immobilized to the fused silica slide. A Cy3-labeled p7G primer (SEQ ID NO:2) was
annealed to the template. As illustrated in the Figure, the primer was first extended up to the A residue
with non-labeled dATP nucleotides. Then, dUTP-Cy3 nucleotide was incorporated and imaged.
Images taken at this time show high correlation (see the upper left correlation graph). After bleaching
the dyes, dCTP-Cy3 was applied to the sample. Images taken at this time show low correlation (see
the lower left correlation graph). Thereafter, non-labeled dGTP was added to fill the CCCCC gap up to
the G residue in the sequence. At this time, incorporation of a dCTP-Cy3 nucleotide was examined
again. This time there was a correlation between the dCTP-Cy3 positions and the primer positions in
general, and in particular there was a correlation with the positions of the dUTP incorporated in the
first incorporation cycle. Thereafter, dUTP-Cy3 was added. Correlation was found between the
labeled primer positions and the signal from dUTP-Cy3, but no correlation was found between the new
dUTP-Cy3 positions and the positions that had incorporated dUTP in the first incorporation cycle
(lower right graph). The interpretation is that not all the primers were extended in the first dUTP
incorporation cycle, that those which were not extended could incorporate dUTP in the second
incorporation cycle, and that those which did incorporate dUTP in the first cycle could not incorporate
dUTP again in the second cycle. The results indicate that on those spots which have incorporated the
first U residue there were also incorporations of a C but not a U residue. Thus, the identity of a second
base can be determined with the experimental scheme, although the yield for the second base (upper
right graph) was not as good as for the first base (upper left graph).

[0390] In a control experiment, after filling in with A residues, dCTP-Cy3 (wrong nucleotide for
the first base) was added. Correlation between Cy3-labeled primer position and C-Cy3 was low (data
not shown). In another control, after filling in the string of A residues, the U residue, and the G residues,
U-Cy3 (wrong residue for the second base) was added. The correlation observed from the results in
this experiment was low (at the noise level; data not shown). Using different oligonucleotide
templates, the experimental scheme was repeated for successive incorporations of other combinations of
two or more nucleotides (data not shown). The results confirmed correct incorporation of the first
labeled nucleotide with a high signal-to-noise ratio and subsequent incorporations of more nucleotides
with a relatively lower signal-to-noise ratio. Taken together, these data indicate that the observed
results (e.g., as shown in Figure 12) are not due to artifacts, but rather demonstrate efficacy of
base-by-base analysis with the experimental scheme.
Example 8 

Two Color Incorporation: Fluorescence Resonance Energy Transfer

[0391] This experiment demonstrates incorporation of two different fluorescent labels into the same
immobilized polynucleotide template through detection of fluorescence resonance energy transfer
(FRET). In this experiment, two fluorescent labels were used (Cy5 and Cy3), and FRET from dUTP-Cy3
(donor) to dCTP-Cy5 (acceptor) was examined at the single molecule level as shown in Figure
13.
[0392] An image of the DNA template with the labeled primer is shown in the left panel. Detection of
FRET after incorporation of the two labels is provided in the right image. Correlation between the
template location and the incorporation signals is shown in the middle graph. As indicated, there is a
high correlation between the template location and the incorporated nucleotide location. A control
experiment was performed in which no polymerase was present. Results from the control experiment
showed a low correlation between the template location and the location of labeled nucleotides. The FRET
experiment provides a particularly high signal-to-noise ratio, as there is almost no signal from
nonspecific binding of dyes to the surface.

[0393] When the two labels were incorporated into a primer in close vicinity, i.e., a few
nanometers apart, a single molecule FRET signal was detected (Figure 14). To detect the FRET
signal, the optical setup was altered. An image splitter was added so that the same area was imaged
twice (Optical Insights LTD, micro imager device). In one channel, a fluorescence filter detected only
the donor (Cy3) fluorescence. In the other channel, a filter for the acceptor (Cy5) was placed. With this
setup, individual spots were examined after incorporation. Figure 15 further indicates that the FRET
detection scheme allows measurement of the incorporation rate with a good signal-to-noise ratio.
Example 9 

[0394] Figure 24 illustrates choking using Cy5-labeled nucleotides. The reaction conditions used 
were as follows: 

[0395] Detection and Data Analysis. An upright microscope (BH-2, Olympus, Melville, NY)
equipped with total internal reflection (TIR) illumination served as a platform for the experiments.
Two laser beams, 635 nm (Coherent, Santa Clara, CA) and 532 nm (Brimrose, Baltimore), with nominal
powers of 8 and 10 mW, respectively, were circularly polarized by quarter-wave plates and underwent
TIR in a dove prism (Edmund Scientific, Barrington, NJ). The prism was optically coupled to the
fused silica bottom (Esco, Oak Ridge, NJ) of a hybridization chamber (Sigma) so that evanescent
waves illuminated up to 150 nm above the surface of the fused silica. An objective (DPlanApo, 100
UV 1.3oil, Olympus) collected the fluorescence signal through the top plastic cover of the chamber,
which was deflected by the objective to ≈40 μm from the silica surface. An image splitter (Optical
Insights, Santa Fe, NM) directed the light through two bandpass filters (630dcxr, HQ585/80,
HQ690/60; Chroma Technology, Brattleboro, VT) to an intensified charge-coupled device (I-PentaMAX;
Roper Scientific, Trenton, NJ), which recorded adjacent images of a 120- × 60-μm section
of the surface in two colors. Typically, eight exposures of 0.5 sec each were taken of each field of
view to compensate for possible intermittency in the fluorophore emission. Custom IDL software was
modified to analyze the locations and intensities of fluorescent objects in the intensified
charge-coupled device pictures.

Sample Preparation

[0396] The target DNA was composed of a DNA primer, [Cy3-5′-tagaacctccgtgt-3′], which
was annealed to template 3 [3′-atcttggaggcacaCTACIX^^ (all
oligonucleotides were synthesized by Operon Technologies, Alameda, CA). This template was
designed so that labeled nucleotides would be incorporated in adjacent positions. Surface chemistry
based on polyelectrolytes and biotin-streptavidin bonding was used to anchor the DNA molecules to
the fused silica surface of the hybridization chamber and to minimize nonspecific binding of the
nucleotides to the surface. Slides were sonicated in 2% MICRO-90 soap (Cole-Parmer, Vernon Hills,
IL) for 20 min and then cleaned by immersion in boiling RCA solution (6:4:1 high-purity H2O/30%
NH4OH/30% H2O2) for 1 h. They were then immersed alternately in polyallylamine (positively
charged) and polyacrylic acid (negatively charged; both from Aldrich) at 2 mg/ml and pH 8 for 10
min each and washed intensively with distilled water in between. The carboxyl groups of the last
polyacrylic acid layer served to prevent the negatively charged labeled nucleotide from binding to the
surface of the sample. In addition, these functional groups were used for further attachment of a layer
of biotin. The slides were incubated with 5 mM biotin-amine reagent (Biotin-EZ-Link, Pierce) for 10
min in the presence of 1-[3-(dimethylamino)propyl]-3-ethylcarbodiimide hydrochloride (EDC,
Sigma) in MES buffer, followed by incubation with Streptavidin Plus (Prozyme, San Leandro, CA) at
0.1 mg/ml for 15 min in Tris buffer. The biotinylated DNA templates were deposited onto the
streptavidin-coated chamber surface at 10 pM for 10 min in Tris buffer that contained 100 mM
MgCl2. For incorporations, the reaction solution contained Klenow fragment Exo-minus polymerase
(New England Biolabs) at 10 nM (100 units/ml) in the reaction buffer (EcoPol buffer, New England
Biolabs) and a nucleotide triphosphate. dATP, dGTP, dTTP, and dCTP from Roche Diagnostics;
dCTP-Cy3, dUTP-Cy3, and dUTP-Cy5 from Amersham Pharmacia; dCTP-Cy5, dATP-Cy3, dGTP-Cy3,
dATP-Cy5, and dGTP-Cy5 from Perkin-Elmer; and dCTP-Alexa647 from Molecular Probes
were used at 0.2 μM for the Cy3-labeled and 0.5 μM for the Cy5-labeled and unlabeled nucleotides.
Incubation times were 6-15 min, with the longer incubation times at the later stages of the experiment.
To reduce bleaching of the fluorescent dyes, an oxygen scavenging system was used during all green
illumination periods, with the exception of the bleaching of the primer tag.
Reagent Exchange Sequence for Single-Pair FRET Sequencing 

[0397] The positions of the anchored Cy3-primed DNA were recorded, and then the tags were
bleached by the green laser illumination (Figure 24a). dUTP-Cy3 and polymerase were introduced
and washed out. An image of the surface was then analyzed for incorporated U-Cy3. If there was
none, the process was repeated with dCTP-Cy3. If there was still no incorporation, incubation was
repeated with unlabeled dATP and dGTP and then cycled again from the beginning until the first
fluorescently labeled base had been incorporated. The Cy3 dye of this incorporated nucleotide was
kept unbleached. Next, a mix of dATP, dGTP, and polymerase was incubated to ensure that the
primer was extended until the next A or G of the template. At this point, the reagents were switched
to Cy5-labeled nucleotides or Alexa-647, a Cy5 analogue (Molecular Probes). The incorporation and
observation process was repeated, except that each observation with green illumination was followed
by an observation with red illumination to photobleach any incorporated Cy5 fluorophores. After
bleaching the acceptor, the mix of dATP, dGTP, and polymerase was again incubated and washed out,
and the sample was observed briefly with green illumination to record the recovery of the donor.
[0398] Figure 24(a) is a schematic illustrating extension of template 3, which includes adjacent
incorporations of labeled dCTP and dUTP. Figure 24(b) shows a sequence trace from an experiment
with template 3. The label at each column indicates the last nucleotide to be incubated, and successful
incorporation events are marked with an arrow. Figure 24(c) shows the FRET efficiency as a function
of the experimental epoch.
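The text does not give the formula behind the FRET efficiency plotted in Figure 24(c). A commonly used estimate from a two-channel image is the proximity ratio, acceptor intensity divided by the sum of acceptor and donor intensities. The sketch below assumes background-subtracted intensities and treats the correction factor `gamma` (relative detection efficiency of the two channels) as 1 unless supplied; the function name and parameters are hypothetical.

```python
def fret_proximity_ratio(donor, acceptor, gamma=1.0):
    """Estimate FRET efficiency as I_A / (I_A + gamma * I_D).

    donor and acceptor are background-subtracted intensities of one spot in
    the Cy3 and Cy5 channels. Returns None when there is no signal to divide by.
    """
    total = acceptor + gamma * donor
    if total <= 0:
        return None
    return acceptor / total


# Hypothetical intensities for one spot across three epochs:
# donor only, donor plus acceptor, and after the acceptor is bleached.
trace = [(400.0, 0.0), (100.0, 300.0), (380.0, 0.0)]
efficiencies = [fret_proximity_ratio(d, a) for d, a in trace]
```

In this hypothetical trace the ratio rises from 0 to 0.75 when the acceptor is incorporated and returns to 0 when the acceptor is bleached and the donor recovers, which is the qualitative pattern the reagent exchange sequence is designed to reveal.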

[0399] Yield was reduced to about 10% for the second incorporation, indicating that, in most cases,
the polymerase was halted or choked and elongation was prevented due to the bulkiness of the
adjacent label. Dyes larger than Cy5 can be used to reduce yield further, halting the polymerizing
agent in all cases due to the increased bulkiness of an incorporated label.
Example 10 

[0400] An exemplified scheme of coating a substrate with PEM for immobilizing polynucleotide is 
as follows: 

[0401] Carboxylic acid groups are negatively charged at pH 7, and are a common target for
covalent bond formation. Terminating the surface with carboxylic acid groups generates a surface
which is both strongly negatively-charged and chemically reactive. In particular, amines can link to
carboxylic acid groups to form amide bonds, a reaction catalyzed, for example, by carbodiimides.
Thus, a molecule with biotin at one end, a hydrophilic spacer, and an amine at the other end can be
used to terminate the surface with biotin.

[0402] An avidin molecule is capable of binding up to four biotin molecules. This means that
avidin, and its derivative Streptavidin, are capable of converting a biotin-terminated surface to a surface
capable of capturing biotin. Streptavidin, which carries a slight negative charge, can then be used to
attach the polynucleotide templates to be analyzed to the surface by using a biotinylated primer. A
buffer with a high concentration of multivalent salt can be used in order to screen the repulsion of the
negatively charged surface for the negatively-charged DNA.

[0403] To coat the polyelectrolyte multilayer, the glass cover slips can be first cleaned with high
purity H2O (H2O deionized to 18.3 MOhm-cm and filtered to 0.2 μm) and an RCA solution (6:4:1
mixture of high purity H2O, 30% NH4OH, and 30% H2O2). The cover slips can then be
sonicated in 2% Micro 90 detergent for 20 minutes. After rinsing thoroughly with high purity H2O,
the cover slips can be stirred in gently boiling RCA solution for at least 1 hour, and rinsed again with
high purity H2O.

[0404] After cleaning, the glass cover slips can be submerged in PAH solution (Poly(allylamine)
(PAH, +): 2 mg/ml in high purity H2O, adjusted to pH 7.0) and agitated for at least 10 minutes. The
cover slips can then be removed from PAH and washed with high purity H2O by submerging in high
purity H2O with agitation, repeated at least three times. The treatment can continue by agitation in a
PAcr solution (Poly(acrylic acid) (PAcr, -): 2 mg/ml in high purity H2O, adjusted to pH 7.0) for at least 10
minutes, followed by washing with high purity H2O. The treatment steps can then be repeated once.
[0405] After PEM coating, the PEM coated glass can be incubated with an EDC/BLCPA solution
for 30 minutes. The EDC/BLCPA solution can be prepared by mixing equal amounts of 50 mM EDC
solution (in MES buffer) and 50 mM BLCPA (in MES buffer) and diluting to 5 mM in MES buffer.
The glass can then be rinsed with 10 mM Tris-NaCl and incubated with 0.1 mg/ml streptavidin
solution for 1 hour. After washing with 10 mM Tris-NaCl, the glass can be incubated with a solution
containing the polynucleotide template (for example, 10⁻⁷ M in Tris 100 mM MgCl2) for 30 minutes.
The glass can be again rinsed thoroughly with 10 mM Tris-NaCl.

[0406] For in-situ attachment, the microfluidic substrate can be bonded to the glass cover slip by
HCl-assisted bonding. Essentially, the chips can be first washed with a surfactant (e.g., first with
high purity H2O, then in 0.1% Tween 20, then rinsed again with high purity H2O). The
washed microfluidic chips can then be put on the glass cover slips with a few microliters of dilute HCl
(e.g., 1% HCl in high purity H2O), followed by baking at 37°C for 1-2 hours. Such treatment can
enhance the bond strength to glass (e.g., >20 psi pressure) without increasing nonspecific adsorption.
[0407] Following HCl treatment, PEM formation, biotinylation, and streptavidinylation, template
attachment can be performed using essentially the same reagents and methods as described above for
ex-situ attachment, except that the solutions can be injected through the channels by pressure instead
of just being aliquoted onto the substrate surface.

Example 11

[0408] Figure 19 illustrates the advantage of short-cycle sequencing with respect to avoiding long
homopolymer reads. Figure 19a illustrates a simulated analysis of 10 target polynucleotides using
non-short-cycle sequencing (Example 11a), whereas Figure 19b illustrates a simulated analysis of the
same number of target polynucleotides using short-cycle sequencing (Example 11b).

[0409] The simulations were performed as follows: an Excel spreadsheet was opened and
"Customize..." was selected from the "Tools" menu of the Excel toolbar. The "Commands" tab was
selected and, after scrolling down, "Macros" was clicked. The "smiley face" that appeared in the
right panel was dragged to the toolbars on top of the spreadsheet. The "Customize" box was closed
and the "smiley face" clicked once. From the list of subroutines that appeared,
"ThisWorkbook.Main_Line" was selected. The program was run by clicking again on the "smiley
face."

[0410] Input values were then entered into the tabbed sheet called "In Out." There were three
input values:

[0411] The first input value corresponded to the period of time allowed for incorporation reactions
of provided nucleotides into the growing complementary strands of the polynucleotides to be
analyzed. This period was conveniently measured in half lives of the incorporation reaction itself.
Each cycle of incorporation was simulatedly halted after a period of time, representing, for example,
the time when unincorporated nucleotides would be flushed out or the incorporation reactions
otherwise halted.

[0412] The second input value corresponded to the number of times each cycle of incorporation was
repeated. That is, the number of times the steps of providing nucleotides, allowing incorporation
reactions into the complementary strands in the presence of polymerizing agent, and then halting the
incorporations were repeated. The nucleotides were simulatedly provided as a wash of each of dATPs,
dGTPs, dTTPs, and dCTPs. The program then recorded which nucleotides were incorporated,
corresponding to a detection step of detecting incorporation.

[0413] The third input value corresponded to the number of target polynucleotides to be
analyzed in the simulation. The program allowed up to 1100 target polynucleotide molecules to be
analyzed in a given simulation.

[0414] After the program was started, as described above, the program first generated the input
number of strands composed of random sequences. The program then simulated hybridization and
polymerization of the correct base of each incorporation reaction, based on the generated sequence of
the target polynucleotide templates. The program continued these simulated reactions for the allowed
amount of simulated time, determined by the input number of half lives. Statistics of the simulation
were then computed and reported, including the longest strand, the shortest strand, and the average
length of all strands, as well as the fraction of strands extended by at least 25 nucleotide
incorporations, as discussed in more detail below.
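The spreadsheet macro itself is not reproduced in the text. The following Python sketch is therefore not that program, only an illustration of the same three inputs and reported statistics under stated assumptions: each incorporation is modeled as a first-order process whose waiting time is exponential with a median of one half life, one cycle consists of one wash of each of the four nucleotide types, and the random sequence is treated directly as the sequence of the growing complementary strand. All names (`simulate`, `template_len`, the result keys) are hypothetical.

```python
import math
import random

BASES = "ACGT"


def simulate(half_lives, repeats, n_strands, template_len=300, seed=1):
    """Simulate cycle-limited extension of n_strands random templates.

    half_lives : duration of each wash, in half lives of the incorporation reaction
    repeats    : number of cycles (each cycle = one wash per base type)
    n_strands  : number of target polynucleotides
    """
    rng = random.Random(seed)
    rate = math.log(2)  # first-order rate constant when time is in half lives
    lengths, max_runs = [], []
    for _ in range(n_strands):
        template = [rng.choice(BASES) for _ in range(template_len)]
        pos, max_run = 0, 0
        for _ in range(repeats):
            for base in BASES:  # one wash per nucleotide type
                elapsed, run = 0.0, 0
                # Keep extending while the next base matches and time remains.
                while pos < template_len and template[pos] == base:
                    elapsed += rng.expovariate(rate)
                    if elapsed > half_lives:
                        break
                    pos += 1
                    run += 1
                max_run = max(max_run, run)
        lengths.append(pos)
        max_runs.append(max_run)
    return {
        "longest": max(lengths),
        "shortest": min(lengths),
        "average": sum(lengths) / n_strands,
        "frac_ge_25": sum(n >= 25 for n in lengths) / n_strands,
        # Fraction of strands with at least one wash adding >= 2 (or >= 3) bases.
        "frac_run_ge_2": sum(r >= 2 for r in max_runs) / n_strands,
        "frac_run_ge_3": sum(r >= 3 for r in max_runs) / n_strands,
    }


long_cycles = simulate(half_lives=10, repeats=12, n_strands=200)
short_cycles = simulate(half_lives=0.8, repeats=60, n_strands=200)
```

Because the kinetic model and random sequences here are assumptions, the numbers it produces should not be expected to match Figures 19 and 20; the sketch is meant only to show how shortening the wash period limits the number of same-base incorporations per wash.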

[0415] In the first part of this simulation, Example 11a, the input values used were a cycle period
of 10 half lives, 12 repeats of the cycle, and 10 target polynucleotide strands.
[0416] Figure 19a illustrates the results obtained. Homopolymer stretches which occurred in the
same simulated complementary strand are highlighted in magenta wherever 2 nucleotides of the same
base type were incorporated in a row, and in cyan wherever more than two nucleotides of the same
base type were incorporated in a row.

[0417] Figure 19a illustrates that the output values included the longest extended complementary
strand obtained during the simulation (Longest extension in the ensemble of molecules); the shortest
extended complementary strand obtained during the simulation (Shortest extension in the ensemble of
molecules); and the average extension. These numbers represent the greatest number of
incorporations into any of the 10 simulatedly growing complementary strands, the smallest number of
incorporations for any of the 10, and the average number of incorporations for the 10. Figure 19a
indicates that the values obtained for Example 11a were 37 incorporations in the longest extension, 25
in the shortest, and 30.00 as the average number of incorporations.

[0418] The output values also provided information on the number of incorporations that occurred
in each of the growing complementary strands during each cycle period of the simulation. For example,
Figure 19a indicates that for the input values of Example 11a, the percentage of growing strands
extended by two or more nucleotides in a homopolymer stretch was 100.0%; and the percentage of
growing strands extended by three or more nucleotides in a homopolymer stretch was 60.0%. That
is, using a cycle period of 10 half lives resulted in only 40% of the complementary strands being
extended by two or fewer nucleotides in a homopolymer stretch per cycle of incorporation.
[0419] Further, output values also indicated the total number of incorporations for each of the
growing strands for the total number of repeated cycles. This represents the length of the sequence of
target polynucleotide analyzed. Figure 19a illustrates that in Example 11a, 100.0% of the 10 target
polynucleotides of the simulation were extended by at least 25 incorporated nucleotides. This
illustrates that using a cycle period of 10 half lives, and repeating the cycles 12 times, allowed
analysis of a 25 base sequence of 10 target polynucleotides.

[0420] In the second part of this simulation, Example 11b, the input values used were a cycle
period of 0.8 half lives, 60 repeats of the cycle, and 10 target polynucleotide strands.
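To put the two cycle periods on a common scale, suppose (as an illustration only; the text does not state the kinetic model used by the program) that each incorporation is a first-order process. The probability that an available incorporation completes within a cycle of τ half lives is then:

```latex
P(\tau) = 1 - 2^{-\tau}, \qquad
P(10) = 1 - 2^{-10} \approx 0.999, \qquad
P(0.8) = 1 - 2^{-0.8} \approx 0.426
```

Under this assumption a cycle of 10 half lives runs essentially every available incorporation to completion, so homopolymer stretches are read through in a single cycle, whereas a cycle of 0.8 half lives completes fewer than half of the first available incorporations and makes several consecutive incorporations within one cycle correspondingly rarer.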
[0421] Figure 19b illustrates the results obtained. Homopolymer stretches which occurred in the
same simulated complementary strand are highlighted in magenta wherever 2 nucleotides of the same
base type were incorporated in a row, and in cyan wherever more than two nucleotides of the same
base type were incorporated in a row.

[0422] Figure 19b illustrates that the output values included the longest extended complementary
strand obtained during the simulation (Longest extension in the ensemble of molecules); the shortest
extended complementary strand obtained during the simulation (Shortest extension in the ensemble of
molecules); and the average extension. These numbers represent the greatest number of
incorporations into any of the 10 simulatedly growing complementary strands, the smallest number of
incorporations for any of the 10, and the average number of incorporations for the 10. Figure 19b
indicates that the values obtained for Example 11b were 37 incorporations in the longest extension, 26
in the shortest, and 32.00 as the average number of incorporations.

[0423] The output values also provided information on the number of incorporations that occurred
in each of the growing complementary strands during each cycle period of the simulation. For example,
Figure 19b indicates that for the input values of Example 11b, the percentage of growing strands
extended by two or more nucleotides in a homopolymer stretch was 80.0%; and the percentage of
growing strands extended by three or more nucleotides in a homopolymer stretch was 10.0%. That
is, using a cycle period of 0.8 half lives resulted in 90% of the complementary strands being extended
by two or fewer nucleotides per cycle of incorporation.

[0424] Output values also indicated the total number of incorporations for each of the growing
strands for the total number of repeated cycles. As in Example 11a, this represents the length of the
sequence of target polynucleotide analyzed. Figure 19b illustrates that in Example 11b, 100.0% of the
10 target polynucleotides of the simulation were again extended by at least 25 incorporated
nucleotides. This illustrates that using a cycle period of 0.8 half lives, and repeating the cycles 60
times, allowed analysis of a 25 base sequence of 10 target polynucleotides.
[0425] Comparing the two simulations, it will be appreciated by those in the art that the use of
short cycles of sequencing overcame issues of reading long repeats of homopolymer stretches in
sequencing by synthesis, without using blocking moieties, as only a few nucleotides were
incorporated per cycle. Comparing Examples 11a and 11b, the long cycles in 11a resulted in 40% of
the extended complementary strands having two or fewer homopolymer nucleotide incorporations per
cycle. Conversely, the short cycles in 11b resulted in 90% of the extended complementary strands
having two or fewer homopolymer nucleotide incorporations per cycle, facilitating quantification. That
is, as explained more thoroughly above, shorter reads can be quantitated to determine the number of
nucleotides incorporated, for example, where the nucleotides are of the same base type and bear the
same labeling moiety. Methods known in the art can correlate increases in the signal intensity
from the same labeling moieties to determine the number of incorporated nucleotides when the
number is relatively small. For example, imaging systems known in the art can reliably distinguish
the difference between one versus two fluorescent labeling moieties on consecutively-incorporated
nucleotides, and/or two versus three fluorescent labeling moieties on consecutively-incorporated
nucleotides. Moreover, signals from the incorporated nucleotides can be reduced, e.g., by bleaching
or removal of the signal generating moiety of the labeling moiety, before carrying out the next cycle
of incorporations or after the number of cycles resulting in too large a number of incorporated
nucleotides (that is, numbers too high to be accurately quantitated based on increasing signal
intensity).

[0426] Comparing Examples 11a and 11b also indicated that a greater number of repeated cycles
was needed to analyze a given length of sequence when using shorter cycles. That is, the 10 half life
cycle was repeated 12 times to result in 100.0% of the 10 complementary strands being extended by at
least 25 nucleotides, whereas the 0.8 half life cycle was repeated 60 times to obtain this same result
and thereby analyze the 25-nucleotide sequence.

[0427] Nonetheless, many aspects of the repeated cycles may be automated, for example, using
microfluidics for washing nucleotides to sites of anchored target polynucleotides, and washing out
unincorporated nucleotides to halt each cycle.
Example 12 

[0428] Figure 20 illustrates yet another simulated analysis of a number of target polynucleotides
using short-cycle sequencing. The simulation was run using the program described in Examples 11a
and 11b but using a larger number of target polynucleotides.

[0429] That is, in this simulation, the input values used were a cycle period of 0.8 half lives, 60
repeats of the cycle, and 200 target polynucleotide strands.

[0430] Figure 20 illustrates the results obtained. Homopolymer stretches which occurred in the
same simulated complementary strand are highlighted in magenta wherever 2 nucleotides of the same
base type were incorporated in a row, and in cyan wherever more than two nucleotides of the same
base type were incorporated in a row.
[0431] The output values obtained were 48 incorporations in the longest extended complementary
strand, 20 in the shortest, and 32.00 as the average number of incorporations for the 200 simulatedly
extended complementary strands.

[0432] Further, the percentage of growing strands extended by two or more nucleotides in a
homopolymer stretch was 78.5%; and the percentage of growing strands extended by three or more
nucleotides in a homopolymer stretch was 4.0%. That is, using a cycle period of 0.8 half lives
resulted in 96.0% of the complementary strands being extended by two or fewer nucleotides in a
homopolymer stretch per cycle of incorporation. Moreover, 95.5% of the 200 target polynucleotides
of the simulation were extended by at least 25 incorporated nucleotides, while 100% were extended
by at least 20 nucleotides. This illustrates that using a cycle period of 0.8 half lives, and repeating the
cycles 60 times, allows analysis of a 20 base sequence of 200 target polynucleotides.
[0433] The invention may be embodied in other specific forms without departing from the spirit or 
essential characteristics thereof. The foregoing embodiments are therefore to be considered in all 
respects illustrative rather than limiting on the invention described herein. Scope of the invention is 
thus indicated by the appended claims rather than by the foregoing description, and all changes which 
come within the meaning and range of equivalency of the claims are therefore intended to be 
embraced therein.
What is claimed is:

1. A method for analyzing a nucleic acid sequence, the method comprising the steps of:
exposing four types of nucleotides wherein at least one of said types of nucleotides
comprises a detectable label to a nucleic acid duplex comprising a template and a
primer;
permitting incorporation of a labeled nucleotide into said primer in the presence of a
polymerizing agent; and
detecting said incorporation in real time, thereby analyzing said nucleic acid sequence.

2. The method of claim 1, wherein said detecting step is carried out at a rate as fast or faster than
the rate at which said labeled nucleotide is incorporated into said primer.

3. The method of claim 1, wherein said detecting step is carried out by imaging said labeled
nucleotide upon incorporation.

4. The method of claim 1, further comprising repeating said permitting and detecting steps.

5. The method of claim 1, wherein said label is attached to said nucleotide via a linker.

6. The method of claim 5, wherein said linker is cleavable.

7. The method of claim 1, wherein said label is selected from a donor fluorophore and an acceptor
fluorophore.

8. The method of claim 1, further comprising the step of anchoring said duplex to a surface of a
substrate.

9. The method of claim 8, further comprising the step of localizing said duplex on a surface at
individually-addressable locations.

10. A method for analyzing a sequence of a randomly-localized target polynucleotide by
synthesizing a complementary strand, the method comprising the steps of:
permitting random localization of said target polynucleotide on a surface of a substrate;
providing a labeled nucleotide;
allowing incorporation of said labeled nucleotide into said complementary strand in the
presence of a polymerizing agent; and
detecting said incorporation, thereby analyzing said sequence of said target
polynucleotide.

11. The method of claim 10, wherein said detecting step identifies a location of said
randomly-localized target polynucleotide.

12. The method of claim 10, wherein said target polynucleotides are localized on a surface of a
substrate at a density of at least 1,000 target polynucleotides per cm².

1 13. A method for forming a spatially addressable array, which method comprises determining the 

2 sequences of a plurality of polynucleotide molecules in said array in which the surface density of said 

3 plurality is such that a molecule in said array is in an optically resolvable area. 

14. The method of claim 13, wherein said density is at least 1,000 target polynucleotides per cm².

15. [...] steps of:

anchoring a nucleotide duplex comprising a template and a primer to a surface of a 
substrate; 

providing two or more types of labeled nucleotide, said labeled nucleotide comprising a
non-cleavable label and a blocking moiety, wherein said non-cleavable label is attached 
to said nucleotide via an -O-ethoxy linkage; 

allowing incorporation of said nucleotide into said primer in the presence of a 
polymerizing agent; and 

detecting incorporation; thereby analyzing said sequence of said target polynucleotide. 

16. The method of claim 15, further comprising the step of repeating said providing, said
allowing, and said detecting steps.

17. The method of claim 15, wherein said polymerizing agent is a reverse transcriptase.

18. The method of claim 15, wherein said polymerizing agent is a thermostable polymerase.

19. The method of claim 15, wherein said polymerizing agent is a thermodegradable polymerase.

20. The method of claim 15, wherein said label is attached to said nucleotide via a linker.

21. A method for analyzing a sequence of a target polynucleotide, the method comprising the steps
of: 

exposing a nucleotide comprising a label and a blocking moiety to a nucleic acid duplex 
comprising a template and a primer, 

allowing incorporation of said nucleotide into said primer in the presence of a 
polymerizing agent; 

bleaching said label and cleaving said blocking moiety in a single step; and

detecting incorporation, thereby analyzing said sequence of said target polynucleotide.

22. The method of claim 21, wherein said bleaching step comprises chemical bleaching and said
cleaving step comprises chemical cleaving.

23. The method of claim 22, wherein said bleaching step comprises photo-bleaching and said 
cleaving step comprises photo-cleaving. 

24. The method of claim 21, wherein said labeled nucleotide comprises a single type of nucleotide
and said exposing, allowing and detecting steps are repeated utilizing a different type of nucleotide 
until incorporation occurs. 

25. The method of claim 24, further comprising the step of washing to remove unincorporated 
reagents between successive exposing steps. 

26. A method for analyzing a sequence of a target polynucleotide, the method comprising the steps 
of: 

providing a labeled nucleotide, said labeled nucleotide comprising a quenching moiety
on at least one of a non-α-phosphate of said nucleotide and a fluorescent moiety;

allowing incorporation of said nucleotides into said complementary strand in the
presence of a polymerizing agent; and

detecting incorporation, thereby analyzing said sequence of said target polynucleotide.

27. The method as recited in claim 26, wherein said quenching moiety is attached at said γ-phosphate.